World Journal of Advanced Engineering Technology and Sciences
International, Peer-reviewed, Refereed, Open-access | ISSN-approved Journal

ISSN: 2582-8266 (Online)  || UGC Compliant Journal || Google Indexed || Impact Factor: 9.48 || Crossref DOI

Surf Shelter: A Big Data-driven Risk Assessment System Using Multi-label Classification

Aditya Karki 1, *, Ayesha Imam 2, and Navaraj Pandey 2 

1 Math and Computer Science, Fisk University, Nashville, TN 37208.
2 Computer Science, Fisk University, Nashville, TN 37208.

Research Article

World Journal of Advanced Engineering Technology and Sciences, 2025, 16(03), 373-385

Article DOI: 10.30574/wjaets.2025.16.3.1349

DOI url: https://doi.org/10.30574/wjaets.2025.16.3.1349

Received on 11 August 2025; revised on 16 September 2025; accepted on 19 September 2025

Web content moderation faces increasingly diverse threats ranging from phishing and malware to clickbait and fraud. This project, Surf Shelter, proposes a unified risk assessment system that utilizes multi-label classification to detect multiple types of website threats simultaneously. By leveraging big data from sources such as Common Crawl, OpenPageRank, GitHub, VirusTotal, PhishTank, and Google Safe Browsing, we compile a comprehensive dataset of websites and threat intelligence. We extract rich features using natural language processing (DistilBERT embeddings), static code analysis, and security heuristics, and apply an ensemble soft-voting strategy to label websites across threat categories. Preliminary results on 11,500 collected webpages (with 1,000 labeled) show that our model can achieve high overall accuracy (around 84%) in identifying malicious content, though calibration is needed to reduce false positives. An XGBoost classifier outperformed other models in consistency, and a Gaussian Mixture Model (GMM) helped adjust decision thresholds when soft-vote scores indicated misclassifications. The evolving cloud-deployed system demonstrates the feasibility of a one-stop, continuously updating platform for web risk detection. We conclude that a multi-label, data-driven approach can significantly enhance web content safety, and we outline future steps to integrate graph neural networks and deploy a user-facing extension for real-time protection.
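The soft-voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the label set, the three base models, their probability outputs, and the fixed 0.5 thresholds are all assumptions made for illustration. The GMM-based threshold adjustment mentioned in the abstract is not reproduced here.

```python
# Hypothetical threat categories, taken from the examples named in the abstract.
LABELS = ["phishing", "malware", "clickbait", "fraud"]


def soft_vote(model_probs, weights=None):
    """Weighted average of per-label probabilities across base models."""
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weights by default
    total = sum(weights)
    return {
        label: sum(w * probs[label] for w, probs in zip(weights, model_probs)) / total
        for label in LABELS
    }


def assign_labels(scores, thresholds):
    """Multi-label decision: each label is accepted independently of the others."""
    return [label for label in LABELS if scores[label] >= thresholds[label]]


# Made-up outputs from three hypothetical base models for a single webpage.
model_probs = [
    {"phishing": 0.9, "malware": 0.2, "clickbait": 0.6, "fraud": 0.1},
    {"phishing": 0.7, "malware": 0.4, "clickbait": 0.5, "fraud": 0.3},
    {"phishing": 0.8, "malware": 0.3, "clickbait": 0.7, "fraud": 0.2},
]

# Assumed fixed thresholds; a calibrated system would tune these per label.
thresholds = {label: 0.5 for label in LABELS}

scores = soft_vote(model_probs)
predicted = assign_labels(scores, thresholds)
print(predicted)
```

Because each label is thresholded on its own averaged score, a page can receive several threat labels at once or none at all, which is the property that separates multi-label from multi-class classification.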

Multi-label classification; Web threat detection; Big data analytics; Content moderation; Cybersecurity; Ensemble learning

https://wjaets.com/sites/default/files/fulltext_pdf/WJAETS-2025-1349.pdf

Aditya Karki, Ayesha Imam and Navaraj Pandey. Surf Shelter: A Big Data-driven Risk Assessment System Using Multi-label Classification. World Journal of Advanced Engineering Technology and Sciences, 2025, 16(03), 373-385. Article DOI: https://doi.org/10.30574/wjaets.2025.16.3.1349.
 

Copyright © Author(s). All rights reserved. This article is published under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as appropriate credit is given to the original author(s) and source, a link to the license is provided, and any changes made are indicated.


Copyright © 2026 World Journal of Advanced Engineering Technology and Sciences
