Understanding the importance of OSINT in modern research
As the world steadily moves toward digitalization, the volume of digital data is growing explosively. In 2024, the global data volume reached 149 zettabytes, with projections indicating a surge to 181 zettabytes by 2025. Nearly 90% of this data was generated within the past two years, and unstructured data makes up 80% of the total volume.
Digitization opens numerous opportunities for businesses to increase productivity, improve efficiency, cut operational costs, and speed up access to information. Much of this data relates to individuals, such as content on social media platforms and government public records. Knowing how to use publicly available data is therefore essential for supporting intelligence needs in both the private and public sectors.
In this article, I will discuss online techniques to support modern research methods. Before we start, let's introduce the concept of open source intelligence (OSINT) and see how it has become critical to supporting modern online research methods.
What is OSINT, and what are its primary sources?
OSINT refers to the set of methods, tools, online services, and techniques used to acquire data from publicly available sources, mainly the internet.
Although most OSINT data is acquired from the internet, other sources can provide critical intelligence for researchers. In general, OSINT data can be acquired from the following sources:
- Internet: This is the largest source for OSINT data. It includes everything published online that can be accessed for free. Examples include public content on social media platforms, data accessed via conventional search engines, discussion forums, blogs, user-generated media such as videos and images, and deep web resources like academic databases and non-indexed content
- Traditional media outlets: Such as newspapers, magazines, radio and television broadcasts, and roadside advertisements
- Government data: Such as public records (vital records), property records, criminal records, regulatory filings, and anything published by government agencies to the public
- Academic publications: This includes academic dissertations, academic journals, and theses
- Commercial data: This includes data acquired from commercial satellites, financial records, SEC filings, annual reports, and data residing behind a paywall (requiring payment to access)
- Professional networks: Specialized platforms listing information about people and companies, such as LinkedIn, ResearchGate, and industry-specific forums that contain professional insights and connections
- Grey literature: This includes material produced outside traditional commercial publishing channels, some of which requires payment to access, such as specialized journals, books, whitepapers, business documents, technical reports, and preprints
It is worth noting that some OSINT research requires combining data acquired from different sources, such as the internet and grey literature.
Data validation in OSINT
Data validation and verification are important aspects of OSINT research. Researchers must validate their findings against multiple sources to ensure accuracy. For instance, cross-referencing data from government records against commercial databases and academic publications boosts research reliability and gives outcomes a solid basis. To maintain research integrity, digital artifacts should also undergo timestamp analysis and source verification.
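The cross-referencing idea can be sketched in a few lines of Python. This is a minimal illustration, not a complete validation workflow: the `corroborate` function, the source labels, and the sample findings are all hypothetical, and the normalization (trimming and lowercasing) is an assumption that real data would likely need to extend.

```python
from collections import defaultdict

def corroborate(findings, min_sources=2):
    """Keep only claim values reported by at least `min_sources` distinct sources.

    `findings` is an iterable of (source, claim, value) tuples.
    """
    support = defaultdict(set)
    for source, claim, value in findings:
        # Light normalization so trivially different spellings still match.
        support[(claim, value.strip().lower())].add(source)
    return {
        key: sorted(sources)
        for key, sources in support.items()
        if len(sources) >= min_sources
    }

# Hypothetical findings gathered from three source types.
findings = [
    ("government_registry", "company_address", "12 Example Street"),
    ("commercial_database", "company_address", "12 example street "),
    ("social_media", "company_address", "99 Other Road"),
]

confirmed = corroborate(findings)
for (claim, value), sources in confirmed.items():
    print(claim, value, sources)
```

A value reported by a single source is not discarded as false here; it simply stays unconfirmed until a second, independent source supports it.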
How OSINT is used in modern research
OSINT is crucial in modern research because it allows researchers to gather actionable intelligence from a wide range of publicly available data sources at almost no cost.
Here are the key ways OSINT is leveraged in modern research:
Social media analysis
Analyzing social media platforms' content is an important element of OSINT. It now has a dedicated branch within online research called Social Media Intelligence (SOCMINT).
Analyzing content on social media websites supports:
- Individual profiling: Researchers can understand individuals' interests, beliefs, and online behavior by analyzing posts on major social media platforms like Facebook, Instagram, and X. They can also identify relationship networks, track location patterns through geotags and check-ins, and analyze when a person posts to establish daily routines
- Monitoring trends and events: Tracking popular hashtags, mentions, and engagement actions on major social media platforms enables the identification of trending topics and emerging situations in particular regions
- Public opinion analysis: Through sentiment analysis of social media posts over specific time frames or geographical locations, researchers can understand the public response to government policies, products, or brands
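Two of the simpler analyses mentioned above, temporal posting habits and hashtag tracking, can be sketched with the Python standard library alone. The `summarize_posts` function and the sample posts are hypothetical, and the sketch assumes posts have already been collected as ISO 8601 timestamps paired with text; collecting them from a platform is a separate step subject to that platform's terms.

```python
import re
from collections import Counter
from datetime import datetime

HASHTAG = re.compile(r"#(\w+)")

def summarize_posts(posts):
    """Count posts per hour of day and hashtag frequency.

    `posts` is an iterable of (iso_timestamp, text) pairs.
    """
    hours = Counter()
    hashtags = Counter()
    for timestamp, text in posts:
        hours[datetime.fromisoformat(timestamp).hour] += 1
        # Lowercase so "#Traffic" and "#traffic" count as one tag.
        hashtags.update(tag.lower() for tag in HASHTAG.findall(text))
    return hours, hashtags

# Hypothetical sample data.
posts = [
    ("2024-05-01T08:15:00", "Morning commute #Traffic"),
    ("2024-05-02T08:40:00", "Stuck again #traffic #roadworks"),
    ("2024-05-02T21:05:00", "Evening run"),
]

hours, hashtags = summarize_posts(posts)
print(hours.most_common(1))     # busiest posting hour with its count
print(hashtags.most_common(1))  # most frequent hashtag with its count
```

Sentiment analysis is left out on purpose, since it usually depends on a trained model or lexicon rather than a few lines of counting.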
Metadata analysis
Digital files gathered through OSINT contain embedded metadata that provides crucial intelligence. Examples of metadata elements include:
- File creation and modification attributes
- System information and software versions used
- Geographic coordinates from images and video files
- Device identifiers and user accounts
- Edit history and document revisions
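As one concrete case, a .docx file is a ZIP archive whose `docProps/core.xml` part records the author, last editor, timestamps, and revision number. The sketch below reads those fields with the Python standard library. The `docx_core_properties` function and the field names it returns are illustrative choices, and the demonstration uses a synthetic in-memory archive rather than a real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}

def docx_core_properties(source):
    """Return the core metadata fields present in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    found = {}
    for name, tag in FIELDS.items():
        node = root.find(tag, NS)
        if node is not None and node.text:
            found[name] = node.text
    return found

# Build a tiny synthetic archive so the sketch runs without a real document.
core_xml = (
    '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}" xmlns:dcterms="{dcterms}">'
    "<dc:creator>Example Author</dc:creator>"
    "<cp:lastModifiedBy>Example Editor</cp:lastModifiedBy>"
    "<dcterms:created>2024-01-15T09:30:00Z</dcterms:created>"
    "</cp:coreProperties>"
).format(**NS)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", core_xml)
buffer.seek(0)

properties = docx_core_properties(buffer)
print(properties)
```

Image and video metadata such as GPS coordinates lives in other containers (for example EXIF) and typically needs a dedicated library or tool to read.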
Website analysis
Technical analysis of websites reveals operational infrastructure such as:
- Domain registration history and ownership records, available via the WHOIS database
- SSL certificate data and hosting providers
- Technology stack identification through HTTP headers
- Subdomain enumeration for identifying internal services such as VPN and email portals
- Web application frameworks such as content management system (CMS) versions
- Historical snapshots from web archives, such as the Wayback Machine
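Technology stack identification through HTTP headers can be illustrated with a small helper that picks out the response headers that commonly disclose server software. The `fingerprint` function and the header list are assumptions chosen for illustration, and the sample header values are made up. Many sites strip or falsify these headers, so the result is a hint rather than proof.

```python
# Response headers that often disclose server-side software (illustrative list).
FINGERPRINT_HEADERS = ("server", "x-powered-by", "x-generator", "via")

def fingerprint(headers):
    """Return the subset of headers that hint at the technology stack.

    `headers` is any mapping of header name to value; names are compared
    case-insensitively because HTTP header names are not case-sensitive.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        name: normalized[name]
        for name in FINGERPRINT_HEADERS
        if name in normalized
    }

# Hypothetical response headers from a site under analysis.
sample_headers = {
    "Server": "ExampleServer/1.0",
    "X-Powered-By": "ExampleFramework/2.3",
    "Content-Type": "text/html; charset=utf-8",
}

stack_hints = fingerprint(sample_headers)
print(stack_hints)
```

In practice the mapping could come from a real response, for instance `dict(urllib.request.urlopen(url).headers.items())`, provided the researcher is authorized to query the target.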
Geolocation intelligence
IP address tracking enables:
- Approximate physical server location
- VPN exit node identification
- Network infrastructure mapping
- ASN and BGP route analysis
- Traffic flow patterns
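Before any geolocation or ASN lookup, it helps to discard addresses that cannot be located on the public internet at all, such as private, loopback, or malformed values. The sketch below does that filtering with Python's `ipaddress` module. The `routable_addresses` helper and the sample list are hypothetical, and the lookup itself (against a geolocation database or routing registry) is deliberately left out.

```python
import ipaddress

def routable_addresses(candidates):
    """Return only the syntactically valid, globally routable IP addresses."""
    routable = []
    for raw in candidates:
        try:
            address = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # not an IP address at all
        if address.is_global:
            routable.append(str(address))
    return routable

# Hypothetical addresses pulled from logs or email headers.
candidates = [
    "8.8.8.8",               # public IPv4
    "192.168.1.10",          # private range
    "127.0.0.1",             # loopback
    "not-an-ip",             # malformed
    "2001:4860:4860::8888",  # public IPv6
]

to_look_up = routable_addresses(candidates)
print(to_look_up)
```

Only the remaining addresses are worth sending to a geolocation or ASN service, and any location returned for them should be treated as an estimate.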
Email analysis
Email header analysis reveals:
- Mail server configurations
- Delivery path and routing information
- Authentication mechanisms (SPF, DKIM, DMARC)
- Client software identifiers
- Original sending IP addresses
- Temporal patterns in communication
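Several of these elements can be pulled from a raw message with Python's standard `email` package. The sketch below is illustrative only: the `summarize_headers` function is hypothetical, the sample message uses documentation IP ranges and example domains, and the regular expressions are simplifications that ignore IPv6 and less common header layouts.

```python
import re
from email import message_from_string

IP_PATTERN = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
AUTH_PATTERN = re.compile(r"\b(spf|dkim|dmarc)=(\w+)")

def summarize_headers(raw_message):
    """Extract the relay path, authentication results, and client software."""
    message = message_from_string(raw_message)
    hop_ips = []
    # Each relay prepends its Received header, so reverse to read from the origin.
    for value in reversed(message.get_all("Received", [])):
        hop_ips.extend(IP_PATTERN.findall(value))
    auth = dict(AUTH_PATTERN.findall(message.get("Authentication-Results", "")))
    return {"hop_ips": hop_ips, "auth": auth, "client": message.get("X-Mailer")}

# Synthetic message for illustration.
RAW = "\n".join([
    "Received: from mx.example.net (mx.example.net [203.0.113.7])",
    "\tby inbox.example.org with ESMTP; Tue, 02 Jan 2024 10:00:05 +0000",
    "Received: from laptop.example.com (unknown [198.51.100.23])",
    "\tby mx.example.net with ESMTPSA; Tue, 02 Jan 2024 10:00:01 +0000",
    "Authentication-Results: inbox.example.org; spf=pass; dkim=pass; dmarc=pass",
    "X-Mailer: ExampleMail 1.0",
    "From: sender@example.com",
    "To: analyst@example.org",
    "Subject: Test",
    "",
    "Body",
])

summary = summarize_headers(RAW)
print(summary)
```

The first address in `hop_ips` is the candidate original sending IP, but Received headers added before the first trusted relay can be forged, so that value should be corroborated.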
Dark web monitoring
Research on criminal activities on darknets (such as Tor, I2P, and Freenet) includes:
- Monitoring illicit marketplaces, such as those used to sell drugs, weapons, and forged documents
- Cryptocurrency transaction tracking
- Forum communications analysis
- Data leak identification
- Threat actor profiling
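Data leak identification often comes down to checking whether addresses from a monitored organization appear in leaked text. The sketch below shows that matching step only. The `exposed_accounts` function, the watched domain, and the sample dump are hypothetical; the email pattern is a deliberate simplification; and obtaining and handling leaked material carries legal and ethical constraints that are outside this example.

```python
import re

# Simplified email pattern; group 1 captures the domain part.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")

def exposed_accounts(dump_text, watched_domains):
    """Return the unique addresses in `dump_text` that belong to watched domains."""
    watched = {domain.lower() for domain in watched_domains}
    hits = set()
    for match in EMAIL_PATTERN.finditer(dump_text):
        if match.group(1).lower() in watched:
            hits.add(match.group(0).lower())
    return sorted(hits)

# Synthetic dump for illustration.
dump = "\n".join([
    "alice@example.com:REDACTED",
    "bob@other.test:REDACTED",
    "ALICE@example.com:REDACTED",
])

exposed = exposed_accounts(dump, ["example.com"])
print(exposed)
```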
OSINT has introduced radical changes to modern research methods by providing researchers with powerful tools and techniques to gather intelligence from publicly available sources. The combination of advanced search techniques, social media analysis, metadata extraction, and dark web monitoring enables comprehensive data collection and analysis.
As digital data proliferates, mastering OSINT search techniques becomes crucial for researchers across various sectors. Whether analyzing market trends or conducting security assessments, OSINT provides cost-effective solutions for gathering actionable intelligence. Still, researchers must maintain rigorous data validation practices to ensure the reliability and integrity of their findings.