    • PROJECT REPORTS
    • SCHOOL OF COMPUTING SCIENCE & ENGINEERING
    • B.TECH

    Assertive Vision Using Deep Learning

    BT 4287_ Report.pdf (1.362Mb)
    Date
    2022-05-31
    Author
    Siddhant Singh Bhadauria, 18SCSE1010024
    Dharmendra Bisht, 18SCSE1010651
    Abstract
    In this era of technology, companies and developers around the world are embracing artificial intelligence (AI), machine learning (ML), and deep learning (DL). Deep learning systems pass input data through successive layers to predict and classify information. Assertive Vision uses AI and ML techniques to identify the objects in an image from their attributes and features, generate a caption for the image, and convert the generated caption text to voice using APIs. Computer-vision-based assistive devices are a promising and efficient technology that helps blind people understand their surroundings. The purpose of this model is to generate captions for an image automatically using deep learning techniques. Initially, the objects in the image are detected using a Convolutional Neural Network (InceptionV3). Using the detected objects, a syntactically and semantically correct caption for the image is generated using a Recurrent Neural Network (LSTM) with an attention mechanism. Computer vision has become ubiquitous in our society, with applications in several fields. In this project, we focus on one of the visual recognition facets of computer vision, i.e. image captioning. The problem of generating language descriptions for visual data has been studied for a long time, but mainly in the field of video. In the last few years, emphasis has been placed on describing still images with natural text. Due to recent advancements in object detection, the task of scene description in an image has become easier.
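    The attention step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which attention variant the report uses, so this example assumes simple dot-product attention; the function names `softmax` and `attend`, the toy feature vectors, and the decoder state are all hypothetical stand-ins, not the authors' implementation. It shows how a decoder state could weight image-region features and combine them into a context vector for one caption-generation step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Return (weights, context) for one decoding step.

    region_features: list of feature vectors, one per image region
                     (hypothetical stand-ins for CNN encoder outputs).
    hidden_state:    decoder state vector of the same length
                     (hypothetical stand-in for the LSTM hidden state).
    """
    # Score each region by its dot product with the decoder state
    # (an assumed scoring function; the source does not specify one).
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in region_features
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    dim = len(hidden_state)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three regions with 2-d features (made-up numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

    In a full captioning model, the context vector would be fed to the LSTM together with the previous word at every step; here the regions that align with the decoder state (the first and third) receive more weight than the second.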
    URI
    http://10.10.11.6/handle/1/10304
    Collections
    • B.TECH [1324]

    DSpace software copyright © 2002-2016  DuraSpace