Early Smoke/Fire Detection with Vision Sensors in Smart Cities

*Important notice: This news article reports on a paper that has been accepted and is awaiting peer review. Scientific Reports sometimes publishes preliminary scientific reports that are not peer-reviewed and should not, therefore, be regarded as conclusive or treated as established information.*

A hybrid Vision Transformer (ViT) and YOLOv8 system detects smoke and fire in images with high reported accuracy and could enable faster visual fire monitoring in smart cities.

Study: An intelligent approach for early smoke/fire detection using vision sensors in smart cities. Image Credit: ARTpok/Shutterstock.com


Fires can cause major infrastructure damage, loss of life, and disruption to essential services in smart cities. As a result, early detection is a priority, especially in outdoor Internet of Things (IoT) environments where smoke is often the earliest visible sign of danger.

Traditional heat, smoke, gas, and temperature sensors can issue late alerts and are often sensitive to surrounding conditions. They have been known to produce false alarms from harmless triggers such as dust or steam.

In fast-moving outdoor fires, these limitations can have devastating effects.

Computer vision has emerged as a promising alternative. Digital cameras, improved processors, graphics processing units (GPUs), and deep learning have made it more practical to detect smoke and fire directly from images and video in real time.

The Vision Transformer Hybrid Framework

In the Scientific Reports paper, researchers developed a vision-based smoke and fire detection framework that combines a Vision Transformer with the YOLOv8 detection architecture.

The model divides the task between the two components: ViT acts as a global feature extractor, helping the system capture long-range spatial relationships and broader scene context. YOLOv8 serves as the real-time detection head, identifying and localizing fire and smoke regions within the image.

The framework was evaluated using the Fire and Smoke Dataset and the Forest Fire Smoke Dataset, with a combined total of 7,720 images drawn from rural and urban scenes.

Their goal was to improve both detection accuracy and speed, strengthen performance under variable lighting and changing smoke or fire appearance, and support real-time use in complex urban settings.

How YOLOv8 and the Vision Transformer Work Together

The framework uses ViT’s self-attention mechanism to capture visual patterns such as color gradients, texture, and intensity. According to the authors, this helps the model recognize subtle spatial relationships that conventional convolutional neural networks (CNNs) may miss.
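The first step of any Vision Transformer is to cut the image into fixed-size patches and flatten each one into a token vector; self-attention then relates those tokens across the whole image, which is what gives ViT its global view. A minimal, stdlib-only sketch of that patch-embedding step (illustrative only; the paper does not publish its implementation, and `image_to_patch_tokens` is a hypothetical name):

```python
def image_to_patch_tokens(image, patch=2):
    """Split an H x W grayscale image (nested lists) into non-overlapping
    patch x patch blocks and flatten each block into a token vector.
    This mirrors the patch-embedding step at the front of a Vision
    Transformer; the resulting token sequence is what self-attention
    compares globally to capture long-range spatial relationships."""
    h, w = len(image), len(image[0])
    tokens = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tokens.append([image[r + dr][c + dc]
                           for dr in range(patch)
                           for dc in range(patch)])
    return tokens

# A 4x4 "image": each token covers one 2x2 block.
img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]
tokens = image_to_patch_tokens(img, patch=2)
print(len(tokens))   # 4 tokens of length 4
print(tokens[0])     # [0, 1, 4, 5]
```

In a real ViT these tokens are linearly projected and passed through stacked self-attention layers; here they simply illustrate how the image becomes a sequence that attention can relate end to end.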

Those extracted features are then passed to YOLOv8 for fast object localization and classification. The authors selected YOLOv8 because of its strong balance between speed and precision, making it suitable for real-time detection with low latency.

The model was trained on augmented datasets to improve generalization across different smoke densities, lighting conditions, and fire characteristics, reducing false positives and improving overall detection performance.
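The paper does not detail its augmentation pipeline, but label-preserving transforms of this kind typically include flips and brightness jitter. A stdlib-only sketch under that assumption (the `augment` function and its parameters are hypothetical, not the authors' method):

```python
import random

def augment(image, seed=None):
    """Apply simple label-preserving augmentations to a grayscale image
    (nested lists of 0-255 pixel values): a random horizontal flip and a
    random brightness shift. Such transforms expose the model to varied
    lighting and orientation without changing the smoke/fire label."""
    rng = random.Random(seed)
    out = [row[:] for row in image]          # copy, leave input intact
    if rng.random() < 0.5:                   # random horizontal flip
        out = [row[::-1] for row in out]
    shift = rng.randint(-20, 20)             # brightness jitter
    out = [[min(255, max(0, px + shift)) for px in row] for row in out]
    return out

sample = [[0, 255], [10, 20]]
aug = augment(sample, seed=1)
```

Production pipelines would normally add rotations, crops, and color jitter as well; the point is that each transform must leave the annotation valid.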

Results Above 98% Across Metrics

The paper reports 98.5% precision, 97.8% recall, and an F1-score of 98.1% for the proposed ViT-YOLOv8 model. It also reports an accuracy of 99.2% in the abstract and conclusion, although one results section lists 99.6%, indicating an internal inconsistency in the paper's reporting.
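The F1-score is the harmonic mean of precision and recall, so the reported 98.1% can be cross-checked directly from the other two figures (this calculation is illustrative and not taken from the paper):

```python
# F1 = 2 * P * R / (P + R), the harmonic mean of precision and recall.
precision, recall = 0.985, 0.978
f1 = 2 * precision * recall / (precision + recall)
print(round(f1 * 100, 1))  # 98.1
```

The result matches the paper's F1 figure, so the precision, recall, and F1 numbers are at least internally consistent with one another, unlike the two accuracy values.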

The authors say the framework outperformed conventional CNN-based and YOLO-only approaches, with a reported 4.3% gain in accuracy over comparison methods. They also report low inference latency and qualitative results showing accurate localization of smoke and fire regions.

Taken together, those findings suggest the model could support faster visual fire monitoring and improve emergency response planning in smart city settings.

The paper is careful to note that the system was tested mainly on controlled datasets under laboratory conditions, rather than in fully real-world urban deployments. The authors also acknowledge that performance may be affected by weather, occlusion, darkness, and thick fog.

Another limitation is that the framework relies on visual input alone. The researchers suggest that future systems could be strengthened by combining camera-based detection with thermal imaging and environmental sensors such as temperature and humidity sensors. They recommend testing on streaming video and developing lighter models for edge deployment.

Fire Detection in Smart Cities

The study points to a practical direction for early smoke and fire detection: pairing transformer-based scene understanding with fast object detection. The reported results are strong, and the framework appears well-suited to safety applications that depend on rapid visual analysis.

Journal Reference

Abozeid, A., & Alanazi, R. (2026). An intelligent approach for early smoke/fire detection using vision sensors in smart cities. Scientific Reports. DOI: 10.1038/s41598-026-42762-y

Disclaimer: The views expressed here are those of the author expressed in their private capacity and do not necessarily represent the views of AZoM.com Limited T/A AZoNetwork the owner and operator of this website. This disclaimer forms part of the Terms and conditions of use of this website.

Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles on business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in real-world situations, and how they can positively impact everyday life.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2026, March 18). Early Smoke/Fire Detection with Vision Sensors in Smart Cities. AZoSensors. Retrieved on March 18, 2026 from https://www.azosensors.com/news.aspx?newsID=16797.

  • MLA

    Dam, Samudrapom. "Early Smoke/Fire Detection with Vision Sensors in Smart Cities". AZoSensors. 18 March 2026. <https://www.azosensors.com/news.aspx?newsID=16797>.

  • Chicago

    Dam, Samudrapom. "Early Smoke/Fire Detection with Vision Sensors in Smart Cities". AZoSensors. https://www.azosensors.com/news.aspx?newsID=16797. (accessed March 18, 2026).

  • Harvard

    Dam, Samudrapom. 2026. Early Smoke/Fire Detection with Vision Sensors in Smart Cities. AZoSensors, viewed 18 March 2026, https://www.azosensors.com/news.aspx?newsID=16797.

