<!DOCTYPE article
PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20190208//EN"
       "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.4" xml:lang="en">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">Virtual Communication and Social Networks</journal-id>
   <journal-title-group>
    <journal-title xml:lang="en">Virtual Communication and Social Networks</journal-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Виртуальная коммуникация и социальные сети</trans-title>
    </trans-title-group>
   </journal-title-group>
   <issn publication-format="print">2782-4799</issn>
   <issn publication-format="online">2782-4802</issn>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="publisher-id">120371</article-id>
   <article-id pub-id-type="doi">10.21603/vcsn-2026-5-2-176-184</article-id>
   <article-id pub-id-type="edn">LBIBQV</article-id>
   <article-categories>
    <subj-group subj-group-type="toc-heading" xml:lang="ru">
     <subject>Прикладное использование результатов научной деятельности</subject>
    </subj-group>
    <subj-group subj-group-type="toc-heading" xml:lang="en">
     <subject>Applied use of the results of scientific activity</subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title xml:lang="en">Web Application for Automated Metadata Corpus of Social Media Publications: Russian-Language and English-Language Travel Blogs about the Republic of Belarus</article-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Разработка веб-приложения для автоматизации создания корпуса метаданных публикаций в социальных сетях (на материале русскоязычных и англоязычных тревел-блогов о Республике Беларусь)</trans-title>
    </trans-title-group>
   </title-group>
   <contrib-group content-type="authors">
    <contrib contrib-type="author">
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Красовская</surname>
       <given-names>Юлия Юрьевна</given-names>
      </name>
      <name xml:lang="en">
       <surname>Krasowskaja</surname>
       <given-names>Yulija Yu.</given-names>
      </name>
     </name-alternatives>
     <email>hisradaar@gmail.com</email>
     <xref ref-type="aff" rid="aff-1"/>
    </contrib>
   </contrib-group>
   <aff-alternatives id="aff-1">
    <aff>
     <institution xml:lang="ru">Белорусский государственный университет (Беларусь, Минск)</institution>
     <country>Беларусь</country>
    </aff>
    <aff>
     <institution xml:lang="en">Belarusian State University (Belarus, Minsk)</institution>
     <country>Belarus</country>
    </aff>
   </aff-alternatives>
   <pub-date publication-format="print" date-type="pub" iso-8601-date="2026-04-10T04:54:26+03:00">
    <day>10</day>
    <month>04</month>
    <year>2026</year>
   </pub-date>
   <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2026-04-10T04:54:26+03:00">
    <day>10</day>
    <month>04</month>
    <year>2026</year>
   </pub-date>
   <volume>5</volume>
   <issue>2</issue>
   <fpage>176</fpage>
   <lpage>184</lpage>
   <history>
    <date date-type="received" iso-8601-date="2025-12-14T00:00:00+03:00">
     <day>14</day>
     <month>12</month>
     <year>2025</year>
    </date>
    <date date-type="accepted" iso-8601-date="2026-03-30T00:00:00+03:00">
     <day>30</day>
     <month>03</month>
     <year>2026</year>
    </date>
   </history>
   <self-uri xlink:href="https://jcenter.kemsu.ru/en/nauka/article/120371/view">https://jcenter.kemsu.ru/en/nauka/article/120371/view</self-uri>
   <abstract xml:lang="ru">
    <p>В настоящее время одним из ключевых направлений в области анализа социальных медиа является извлечение и обработка метаданных для систематизации сведений о публикациях, авторах и вовлеченности аудитории. Целью исследования является разработка и тестирование веб-приложения OmniTrack, предназначенного для автоматизированного сбора корпуса метаданных русскоязычных и англоязычных публикаций о Республике Беларусь в тревел-блогах и их параметризации. Интеграция методов веб-скрейпинга, многоуровневой архитектуры и модульного подхода обеспечивает масштабируемость, воспроизводимость и расширяемость системы при изменении внешних интерфейсов платформ. Серверная часть реализована на языке Python с использованием фреймворка Flask; для взаимодействия с пользователем создан веб-интерфейс на HTML, CSS и JavaScript. Алгоритмы извлечения данных разработаны как независимые модули: для TikTok применена эмуляция браузера через undetected_chromedriver для обхода динамической отрисовки; для YouTube – библиотека yt_dlp для прямого получения JSON-метаданных; для Instagram – инструмент instaloader, обеспечивающий высокоуровневый доступ к объектной модели публикации. Собранные метаданные приведены к унифицированной схеме с сохранением в формате Excel при помощи библиотеки openpyxl, что обеспечивает удобство последующей статистической обработки. Приложение прошло юзабилити-тестирование: 42 участника обработали более 400 публикаций, оценив простоту установки, скорость работы и интуитивность интерфейса; средняя оценка удобства составила 4,9 балла из 5; выявлены и устранены критические ошибки, включая несовместимость backend-модуля pywebview и некорректную обработку сокращенных ссылок TikTok. Предложенное авторское веб-приложение OmniTrack обеспечивает создание репрезентативного корпуса метаданных, необходимого для последующего анализа дискурсивных, жанровых и коммуникативных особенностей русскоязычных и англоязычных тревел-блогов о Республике Беларусь.</p>
   </abstract>
   <trans-abstract xml:lang="en">
    <p>Metadata extraction and processing are central to social media analysis because they systematize information about publications, authors, and audience engagement. The article presents the development and testing of OmniTrack, a web application for the automated collection and parameterization of a metadata corpus of Russian-language and English-language travel blog posts about the Republic of Belarus. The application combines web-scraping methods with a multi-layer architecture and a modular design, which keeps the system scalable, reproducible, and extensible when external platform interfaces change. The backend is implemented in Python with the Flask framework; the user-facing web interface is built with HTML, CSS, and JavaScript. The data-extraction algorithms are implemented as independent modules: browser emulation via undetected_chromedriver handles TikTok’s dynamic rendering, yt-dlp retrieves JSON metadata directly from YouTube, and Instaloader provides high-level access to Instagram’s post object model. Collected metadata are normalized to a unified schema and saved in Excel format using the openpyxl library, which facilitates subsequent statistical analysis. The application underwent usability testing: 42 participants processed more than 400 posts and evaluated installation simplicity, processing speed, and interface intuitiveness. The mean ease-of-use score was 4.9 out of 5. Critical issues were identified and resolved, including an incompatibility in the pywebview backend module and incorrect handling of shortened TikTok links. OmniTrack enables the construction of a representative metadata corpus for further analysis of the discursive, genre, and communicative features of Russian-language and English-language travel blogs about the Republic of Belarus.</p>
   </trans-abstract>
   <kwd-group xml:lang="ru">
    <kwd>веб-приложение</kwd>
    <kwd>метаданные</kwd>
    <kwd>социальные сети</kwd>
    <kwd>веб-скрейпинг</kwd>
    <kwd>автоматизация сбора данных</kwd>
   </kwd-group>
   <kwd-group xml:lang="en">
    <kwd>web application</kwd>
    <kwd>metadata</kwd>
    <kwd>social media</kwd>
    <kwd>web scraping</kwd>
    <kwd>automated data collection</kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <p></p>
 </body>
 <back>
  <ref-list>
   <ref id="B1">
    <label>1.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Жучкова С. В., Ротмистров А. Н. Автоматическое извлечение текстовых и числовых веб-данных для целей социальных наук. Социология: методология, методы, математическое моделирование. 2021. № 50-51. С. 141–183. https://elibrary.ru/xytjoy</mixed-citation>
     <mixed-citation xml:lang="en">Zhuchkova S. V., Rotmistrov A. N. Automatic extraction of text and numeric web data for social science purposes. Sociology: Methodology, Methods, Mathematical Modeling (AM), 2021, (50-51): 141–183. (In Russ.) https://elibrary.ru/xytjoy</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B2">
    <label>2.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Ahire V. Y. Assessing the effectiveness of metadata management systems in enhancing data governance: A primary study of IT and data-driven organizations. Management Journal for Advanced Research, 2025, 5(3): 85–90. https://doi.org/10.5281/zenodo.16792143</mixed-citation>
     <mixed-citation xml:lang="en">Ahire V. Y. Assessing the effectiveness of metadata management systems in enhancing data governance: A primary study of IT and data-driven organizations. Management Journal for Advanced Research, 2025, 5(3): 85–90. https://doi.org/10.5281/zenodo.16792143</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B3">
    <label>3.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Berman F., Rutenbar R., Hailpern B., Christensen H., Davidson S., Estrin D., Franklin M., Martonosi M., Raghavan P., Stodden V., Szalay A. S. Realizing the potential of data science. Communications of the ACM, 2018, 61(4): 67–72. https://doi.org/10.1145/3188721</mixed-citation>
     <mixed-citation xml:lang="en">Berman F., Rutenbar R., Hailpern B., Christensen H., Davidson S., Estrin D., Franklin M., Martonosi M., Raghavan P., Stodden V., Szalay A. S. Realizing the potential of data science. Communications of the ACM, 2018, 61(4): 67–72. https://doi.org/10.1145/3188721</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B4">
    <label>4.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Brown M. A., Gruen A., Maldoff G., Messing S., Zanderson Z., Zimmer M. Web scraping for research: Legal, ethical, institutional, and scientific considerations. ArXiv, 2024. https://doi.org/10.48550/arXiv.2410.23432</mixed-citation>
     <mixed-citation xml:lang="en">Brown M. A., Gruen A., Maldoff G., Messing S., Zanderson Z., Zimmer M. Web scraping for research: Legal, ethical, institutional, and scientific considerations. ArXiv, 2024. https://doi.org/10.48550/arXiv.2410.23432</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B5">
    <label>5.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Chani T., Olugbara O. O., Mutanga B. The problem of data extraction in social media: A theoretical framework. Journal of Information Systems and Informatics, 2023, 5(4): 1363–1384. https://doi.org/10.51519/journalisi.v5i4.585</mixed-citation>
     <mixed-citation xml:lang="en">Chani T., Olugbara O. O., Mutanga B. The problem of data extraction in social media: A theoretical framework. Journal of Information Systems and Informatics, 2023, 5(4): 1363–1384. https://doi.org/10.51519/journalisi.v5i4.585</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B6">
    <label>6.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Díaz de la Paz L., Crispí A. T., Mederos A. A. L. Model for the evaluation of metadata quality: Proposal for open science management in Cuba. Advanced Notes in Information Science, 2024, 6: 100–113. https://doi.org/10.47909/978-9916-9974-5-1.97</mixed-citation>
     <mixed-citation xml:lang="en">Díaz de la Paz L., Crispí A. T., Mederos A. A. L. Model for the evaluation of metadata quality: Proposal for open science management in Cuba. Advanced Notes in Information Science, 2024, 6: 100–113. https://doi.org/10.47909/978-9916-9974-5-1.97</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B7">
    <label>7.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Edara P., Pasumansky M. Big metadata: When metadata is big data. Proceedings of the VLDB Endowment, 2021, 14(12): 3083–3095. https://doi.org/10.14778/3476311.3476385</mixed-citation>
     <mixed-citation xml:lang="en">Edara P., Pasumansky M. Big metadata: When metadata is big data. Proceedings of the VLDB Endowment, 2021, 14(12): 3083–3095. https://doi.org/10.14778/3476311.3476385</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B8">
    <label>8.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Foerderer J. Should we trust web-scraped data? ArXiv, 2023. https://doi.org/10.48550/arXiv.2308.02231</mixed-citation>
     <mixed-citation xml:lang="en">Foerderer J. Should we trust web-scraped data? ArXiv, 2023. https://doi.org/10.48550/arXiv.2308.02231</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B9">
    <label>9.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Holom R.-M., Rafetseder K., Kritzinger S., Sehrschön H. Metadata management in a big data infrastructure. Procedia Manufacturing, 2020, 42: 375–382. https://doi.org/10.1016/j.promfg.2020.02.060</mixed-citation>
     <mixed-citation xml:lang="en">Holom R.-M., Rafetseder K., Kritzinger S., Sehrschön H. Metadata management in a big data infrastructure. Procedia Manufacturing, 2020, 42: 375–382. https://doi.org/10.1016/j.promfg.2020.02.060</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B10">
    <label>10.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Huang Y.-N., Munteanu V., Love M. I., Ronkowski C. F., Deshpande D., Wong-Beringer A., Corbett-Detig R., Dimian M., Moore J. H., Garmire L. X., Reddy T. B. K., Butte A. J., Robinson M. D., Eskin E., Abedalthagafi M. S., Mangul S. Perceptual and technical barriers in sharing and formatting metadata accompanying omics studies. Cell Genomics, 2025, 5(5). https://doi.org/10.1016/j.xgen.2025.100845</mixed-citation>
     <mixed-citation xml:lang="en">Huang Y.-N., Munteanu V., Love M. I., Ronkowski C. F., Deshpande D., Wong-Beringer A., Corbett-Detig R., Dimian M., Moore J. H., Garmire L. X., Reddy T. B. K., Butte A. J., Robinson M. D., Eskin E., Abedalthagafi M. S., Mangul S. Perceptual and technical barriers in sharing and formatting metadata accompanying omics studies. Cell Genomics, 2025, 5(5). https://doi.org/10.1016/j.xgen.2025.100845</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B11">
    <label>11.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Moreno-Ortiz A., García-Gámez M. Strategies for the analysis of large social media corpora: Sampling and keyword extraction methods. Corpus Pragmatics, 2023, 7: 241–265. https://doi.org/10.1007/s41701-023-00143-0</mixed-citation>
     <mixed-citation xml:lang="en">Moreno-Ortiz A., García-Gámez M. Strategies for the analysis of large social media corpora: Sampling and keyword extraction methods. Corpus Pragmatics, 2023, 7: 241–265. https://doi.org/10.1007/s41701-023-00143-0</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B12">
    <label>12.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Ohme J., Araujo T., Boeschoten L., Freelon D., Ram N., Reeves B. B., Robinson T. N. Digital trace data collection for social media effects research: APIs, data donation, and (screen) tracking. Communication Methods and Measures, 2024, 18(2): 124–141. https://doi.org/10.1080/19312458.2023.2181319</mixed-citation>
     <mixed-citation xml:lang="en">Ohme J., Araujo T., Boeschoten L., Freelon D., Ram N., Reeves B. B., Robinson T. N. Digital trace data collection for social media effects research: APIs, data donation, and (screen) tracking. Communication Methods and Measures, 2024, 18(2): 124–141. https://doi.org/10.1080/19312458.2023.2181319</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B13">
    <label>13.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Park J.-R., Tosaka Y. Metadata quality control in digital repositories and collections: Criteria, semantics, and mechanisms. Cataloging &amp; Classification Quarterly, 2010, 48(8): 696–715. https://doi.org/10.1080/01639374.2010.508711</mixed-citation>
     <mixed-citation xml:lang="en">Park J.-R., Tosaka Y. Metadata quality control in digital repositories and collections: Criteria, semantics, and mechanisms. Cataloging &amp; Classification Quarterly, 2010, 48(8): 696–715. https://doi.org/10.1080/01639374.2010.508711</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B14">
    <label>14.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Park J.-R., Tosaka Y., Maszaros S., Lu C. From metadata creation to metadata quality control: Continuing education needs among cataloging and metadata professionals. Journal of Education for Library and Information Science, 2010, 51(3): 158–176.</mixed-citation>
     <mixed-citation xml:lang="en">Park J.-R., Tosaka Y., Maszaros S., Lu C. From metadata creation to metadata quality control: Continuing education needs among cataloging and metadata professionals. Journal of Education for Library and Information Science, 2010, 51(3): 158–176.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B15">
    <label>15.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Pretorius K. A simple and systematic approach to qualitative data extraction from social media for novice health care researchers: Tutorial. JMIR Formative Research, 2024, 8: 1–9. https://doi.org/10.2196/54407</mixed-citation>
     <mixed-citation xml:lang="en">Pretorius K. A simple and systematic approach to qualitative data extraction from social media for novice health care researchers: Tutorial. JMIR Formative Research, 2024, 8: 1–9. https://doi.org/10.2196/54407</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B16">
    <label>16.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Skluzacek T. J., Chen M., Hsu E., Chard K., Foster I. Models and metrics for mining meaningful metadata. International Conference on Computational Science. Computational Science – ICCS 2022: Proc. 22nd Intern. Conf., London, UK, 21–23 Jun 2022. Springer, 2022, 417–430.</mixed-citation>
     <mixed-citation xml:lang="en">Skluzacek T. J., Chen M., Hsu E., Chard K., Foster I. Models and metrics for mining meaningful metadata. International Conference on Computational Science. Computational Science – ICCS 2022: Proc. 22nd Intern. Conf., London, UK, 21–23 Jun 2022. Springer, 2022, 417–430.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B17">
    <label>17.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Subramaniam P., Ma Y., Li C., Mohanty I., Fernandez R. C. Comprehensive and comprehensible data catalogs: The what, who, where, when, why, and how of metadata management. ArXiv, 2021. https://doi.org/10.48550/arXiv.2103.07532</mixed-citation>
     <mixed-citation xml:lang="en">Subramaniam P., Ma Y., Li C., Mohanty I., Fernandez R. C. Comprehensive and comprehensible data catalogs: The what, who, where, when, why, and how of metadata management. ArXiv, 2021. https://doi.org/10.48550/arXiv.2103.07532</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B18">
    <label>18.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Wilkinson M. D., Dumontier M., Aalbersberg I. J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L. B., Bourne P. E., Bouwman J., Brookes A. J., Clark T., Crosas M., Dillo I., Dumon O., Edmunds S., Evelo C. T., Finkers R., Gonzalez-Beltran A., Gray A. J. G., Groth P., Grethe J. S., Mons B. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 2016, 3(1). https://doi.org/10.1038/sdata.2016.18</mixed-citation>
     <mixed-citation xml:lang="en">Wilkinson M. D., Dumontier M., Aalbersberg I. J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L. B., Bourne P. E., Bouwman J., Brookes A. J., Clark T., Crosas M., Dillo I., Dumon O., Edmunds S., Evelo C. T., Finkers R., Gonzalez-Beltran A., Gray A. J. G., Groth P., Grethe J. S., Mons B. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 2016, 3(1). https://doi.org/10.1038/sdata.2016.18</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B19">
    <label>19.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Yang W., Fu R., Bilal Amin M., Kang B. The impact of modern AI in metadata management. Human-Centric Intelligent Systems, 2025, 5: 323–350. https://doi.org/10.1007/s44230-025-00106-5</mixed-citation>
     <mixed-citation xml:lang="en">Yang W., Fu R., Bilal Amin M., Kang B. The impact of modern AI in metadata management. Human-Centric Intelligent Systems, 2025, 5: 323–350. https://doi.org/10.1007/s44230-025-00106-5</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B20">
    <label>20.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Yulfitri A., Sensuse D. I., Ulum M. B., Achmad Y. F. Metadata management to accelerate Big Data implementation. Journal of Informatics and Communication Technology, 2025, 6(2). https://doi.org/10.52661/jict.v6i2.362</mixed-citation>
     <mixed-citation xml:lang="en">Yulfitri A., Sensuse D. I., Ulum M. B., Achmad Y. F. Metadata management to accelerate Big Data implementation. Journal of Informatics and Communication Technology, 2025, 6(2). https://doi.org/10.52661/jict.v6i2.362</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B21">
    <label>21.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Zachlod C., Samuel O., Ochsner A., Werthmüller S. Analytics of social media data – state of characteristics and application. Journal of Business Research, 2022, 144: 1064–1076. https://doi.org/10.1016/j.jbusres.2022.02.016</mixed-citation>
     <mixed-citation xml:lang="en">Zachlod C., Samuel O., Ochsner A., Werthmüller S. Analytics of social media data – state of characteristics and application. Journal of Business Research, 2022, 144: 1064–1076. https://doi.org/10.1016/j.jbusres.2022.02.016</mixed-citation>
    </citation-alternatives>
   </ref>
  </ref-list>
 </back>
</article>
