Dokoohaki, Nima
Publications (3 of 3)
Hammar, K., Jaradat, S., Dokoohaki, N. & Matskin, M. (2020). Deep text classification of Instagram data using word embeddings and weak supervision. WEB INTELLIGENCE, 18(1), 53-67
2020 (English). In: WEB INTELLIGENCE, ISSN 2405-6456, Vol. 18, no 1, p. 53-67. Article in journal (Refereed), Published
Abstract [en]

With the advent of social media, our online feeds increasingly consist of short, informal, and unstructured text. Instagram is one of the largest social media platforms, containing both text and images. However, most of the prior research on text processing in social media is focused on analyzing Twitter data, and little attention has been paid to text mining of Instagram data. Moreover, many text mining methods rely on training data annotated manually by humans, which in practice is both difficult and expensive to obtain. In this paper, we present methods for weakly supervised text classification of Instagram text. We analyze a corpus of Instagram posts from the fashion domain and train a deep clothing classifier with weak supervision to classify Instagram posts based on the associated text. With our experiments, we demonstrate that in the absence of annotated training data, using weak supervision to train models is a viable approach. With weak supervision we were able to label a large dataset in hours, something that would have taken months to do with human annotators. Using the dataset labeled with weak supervision in combination with generative modeling, an F-1 score of 0.61 is achieved on the task of classifying the image contents of Instagram posts based solely on the associated text, which is on par with human performance.

Place, publisher, year, edition, pages
IOS PRESS, 2020
Keywords
Instagram, weak supervision, word embeddings, deep learning
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-271954 (URN)
10.3233/WEB-200428 (DOI)
000521596300004 (ISI)
2-s2.0-85082963198 (Scopus ID)
Note

QC 20200415

Available from: 2020-04-15 Created: 2020-04-15 Last updated: 2020-05-25 Bibliographically approved
Jaradat, S., Dokoohaki, N., Wara, U., Goswami, M., Hammar, K. & Matskin, M. (2019). TALS: A framework for text analysis, fine-grained annotation, localisation and semantic segmentation. In: Proceedings - International Computer Software and Applications Conference. Paper presented at 43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019 (pp. 201-206). IEEE Computer Society, 8754470
2019 (English). In: Proceedings - International Computer Software and Applications Conference, IEEE Computer Society, 2019, Vol. 8754470, p. 201-206. Conference paper, Published paper (Refereed)
Abstract [en]

With around 2.77 billion users using online social media platforms nowadays, it is becoming more attractive for business retailers to reach and to connect to more potential clients through social media. However, providing more effective recommendations to grab clients’ attention requires a deep understanding of users’ interests. Given the enormous amounts of text and images that users share in social media, deep learning approaches play a major role in performing semantic analysis of text and images. Moreover, object localisation and pixel-by-pixel semantic segmentation image analysis neural architectures provide an enhanced level of information. However, to train such architectures in an end-to-end manner, detailed datasets with specific meta-data are required. In our paper, we present a complete framework that can be used to tag images in a hierarchical fashion, and to perform object localisation and semantic segmentation. In addition to this, we show the value of using neural word embeddings in providing additional semantic details to annotators to guide them in annotating images in the system. Our framework is designed to be a fully functional solution capable of providing fine-grained annotations, essential localisation and segmentation services while keeping the core architecture simple and extensible. We also provide a fine-grained labelled fashion dataset that can be a rich source for research purposes.

Place, publisher, year, edition, pages
IEEE Computer Society, 2019
Series
Proceedings - International Computer Software and Applications Conference, ISSN 0730-3157
Keywords
Annotations, Dataset, Deep learning, Fine-grained, Localisation, Natural language processing, Semantic segmentation, Word embeddings
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-262582 (URN)
10.1109/COMPSAC.2019.10207 (DOI)
2-s2.0-85072655269 (Scopus ID)
9781728126074 (ISBN)
Conference
43rd IEEE Annual Computer Software and Applications Conference, COMPSAC 2019; Milwaukee; United States; 15 July 2019 through 19 July 2019
Note

QC 20191022

Available from: 2019-10-22 Created: 2019-10-22 Last updated: 2019-10-22 Bibliographically approved
Jaradat, S., Dokoohaki, N., Hammar, K., Wara, U. & Matskin, M. (2018). Dynamic CNN Models For Fashion Recommendation in Instagram. In: Chen, JJ; Yang, LT (Ed.), 2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS. Paper presented at 16th IEEE ISPA / 17th IEEE IUCC / 8th IEEE BDCloud / 11th IEEE SocialCom / 8th IEEE SustainCom, DEC 11-13, 2018, Melbourne, AUSTRALIA (pp. 1144-1151). IEEE COMPUTER SOC
2018 (English). In: 2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS / [ed] Chen, JJ; Yang, LT, IEEE COMPUTER SOC, 2018, p. 1144-1151. Conference paper, Published paper (Refereed)
Abstract [en]

Instagram as an online photo-sharing and social-networking service is becoming more powerful in enabling fashion brands to ramp up their business growth. Nowadays, a single post by a fashion influencer attracts a wealth of attention and a multitude of followers who are curious to know more about the brands and style of each clothing item sitting inside the image. To this end, the development of efficient Deep CNN models that can accurately detect styles and brands has become a research challenge. In addition, current techniques need to cope with inherent fashion-related data issues. Namely, clothing details inside a single image only cover a small proportion of the large and hierarchical space of possible brands and clothing item attributes. In order to cope with these challenges, one can argue that neural classifiers should become adapted to large-scale and hierarchical fashion datasets. As a remedy, we propose two novel techniques to incorporate the valuable social media textual content to support the visual classification in a dynamic way. The first method is adaptive neural pruning (DynamicPruning) in which the clothing item category detected from posts' text analysis can be used to activate the possible range of connections of clothing attributes' classifier. The second method (DynamicLayers) is a dynamic framework in which multiple-attributes classification layers exist and a suitable attributes' classifier layer is activated dynamically based upon the mined text from the image. Extensive experiments on a dataset gathered from Instagram and a baseline fashion dataset (DeepFashion) have demonstrated that our approaches can improve the accuracy by about 20% when compared to base architectures. It is worth highlighting that with DynamicLayers we have gained 35% accuracy for the task of multi-class multi-labeled classification compared to the other model.

Place, publisher, year, edition, pages
IEEE COMPUTER SOC, 2018
Series
IEEE International Symposium on Parallel and Distributed Processing with Applications, ISSN 2158-9178
Keywords
Neural Pruning, Dynamic Computation Graph, Dynamic CNN, Image Classification, Text Mining, Fashion Recommendation
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-252671 (URN)
10.1109/BDCloud.2018.00169 (DOI)
000467843200155 (ISI)
2-s2.0-85063866491 (Scopus ID)
978-1-7281-1141-4 (ISBN)
Conference
16th IEEE ISPA / 17th IEEE IUCC / 8th IEEE BDCloud / 11th IEEE SocialCom / 8th IEEE SustainCom, DEC 11-13, 2018, Melbourne, AUSTRALIA
Note

QC 20190603

Available from: 2019-06-03 Created: 2019-06-03 Last updated: 2019-06-03 Bibliographically approved