Every year the number of Internet users grows. By early October 2020, 4.9 billion people – that’s 63.2% of the earth’s population – were using the Internet.
The volume of data on the Internet reached 2.7 zettabytes (1 ZB ≈ 10^12 GB). Each year, the number of users and the number of devices connected to the network grow by 6% and 10% respectively. Much of this information is publicly available. Sources that draw on this data, along with newspapers, magazines, radio and TV broadcasts, and public government reports, are called open sources. Finding information in such sources, selecting and gathering it, and then analyzing it is an intelligence discipline called open-source intelligence (OSINT).

Where and why OSINT?

OSINT as a separate discipline originated in the United States in the 1940s with the establishment of the Foreign Broadcast Monitoring Service. Its staff recorded and analyzed the short-wave radio transmissions of foreign countries and passed the results on as reports to the military and intelligence agencies. According to some CIA and Pentagon officials, the U.S. leadership received 70-90% of its data from open sources and only 10-30% from undercover sources. Today, open-source intelligence is used not only by government security and military agencies but also by commercial companies, analytical agencies, political organizations, and others.


With OSINT you can:

  • Get the most objective and useful information for decision-making;
  • Gain a competitive advantage for your organization or its product;
  • Find weaknesses and vulnerabilities in an organization's security system and in its protection of confidential client information;
  • Understand the psychological features, needs, and habits of the target audience.

In the IT and information security industry, OSINT helps:

  • Collect information about competitors and look for competitive advantages;
  • Analyze the security of a facility and identify its vulnerabilities;
  • Find information leaks;
  • Identify possible threats, their sources, and their targets;
  • Analyze cybercrimes (data theft, hacking, etc.).

Where do you get the data?

Open-source intelligence involves obtaining data from sources in the public domain and those that can be accessed on-demand.
These include:

  • Information materials (articles, news, notes) in the media;
  • Scientific research published in specialized publications;
  • Books – encyclopedias, reference books, memoirs, etc.;
  • Posts and comments on social networks;
  • Information from the census;
  • Documents from public state and non-state archives;
  • Public commercial data (income, profit, loss, growth, stock value, etc.);
  • Results of public surveys;
  • Data from remote sensing satellites and aerial photography aircraft;
  • Police and court records, and other sources.

The challenges of OSINT today

New sources of information appear very quickly: for every 5 sources that disappear, 10 new ones appear. Collected data therefore has to be kept up to date, because much of its value depends on being current. Many sources disappear because of stricter computer security and privacy regulations.

Another difficulty is the limited functionality of information retrieval tools. In most cases a single open-source program is not enough for a search, and such programs do not scale well. Enterprise tools, in turn, have too many components that cannot be customized. Successful information retrieval requires familiarity with a range of tools and the ability to find new ways to retrieve information.

What separates OSINT from intelligence and espionage

The collection and analysis of information in the public domain is not contrary to international law or to the laws of most nations, although some sources and methods of investigation may be on the edge of legality. Industrial or commercial espionage, by contrast, uses illegal methods and tools to obtain information, such as bribing or blackmailing members of a rival organization, gaining unauthorized access to closed databases, and stealing trade secrets.

Any organization or even an individual can monitor and analyze publicly available sources without the use of specialized equipment or “connections” in state security agencies.

OSINT in information security

With the development of the Internet, the focus of analysts’ attention has shifted to cyberspace as one of the main sources of information. Here useful data may include:

  • Registration information about the certificate or domain of the site;
  • Open personal data of users (usernames, e-mail addresses, phone numbers);
  • User activity on social networks (posts, comments, etc.);
  • User search engine queries;
  • HTML code of the site;
  • Public text, graphics, audio, and video files and their metadata (e.g. date, time, and place of creation, the device used);
  • Geolocation data and other types of information.
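
As a rough illustration of working with two of the items above, a site's HTML code and openly published e-mail addresses, here is a minimal Python sketch that uses only the standard library. The class name `PageFacts`, the function `extract_page_facts`, the simplified e-mail pattern, and the sample page are all assumptions made up for this example; a real investigation would fetch pages it is permitted to access and would need more careful parsing.

```python
import re
from html.parser import HTMLParser

# Deliberately simple e-mail pattern; an assumption for this sketch, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PageFacts(HTMLParser):
    """Collects <meta> name/content pairs, link targets, and e-mail addresses."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.links = []
        self.emails = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "content" in attrs:
            # Pages use either name="..." or property="..." to label metadata.
            key = attrs.get("name") or attrs.get("property")
            if key:
                self.meta[key] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            href = attrs["href"]
            self.links.append(href)
            if href.startswith("mailto:"):
                # Drop any "?subject=..." suffix after the address.
                self.emails.add(href[len("mailto:"):].split("?")[0])

    def handle_data(self, data):
        # Visible text may also contain addresses.
        self.emails.update(EMAIL_RE.findall(data))


def extract_page_facts(html):
    """Parse an HTML string and return the collected metadata, links, and e-mails."""
    parser = PageFacts()
    parser.feed(html)
    parser.close()
    return {
        "meta": parser.meta,
        "links": parser.links,
        "emails": sorted(parser.emails),
    }


# Hypothetical page used only to demonstrate the parser.
SAMPLE_HTML = """
<html><head>
<meta name="generator" content="ExampleCMS 1.2">
<meta name="author" content="J. Doe">
</head><body>
<p>Contact: press@example.com</p>
<a href="mailto:admin@example.com?subject=hi">Mail the admin</a>
<a href="/about">About</a>
</body></html>
"""

if __name__ == "__main__":
    print(extract_page_facts(SAMPLE_HTML))
```

Even this small amount of page metadata can be informative: a `generator` tag, for instance, may reveal which content management system and version a site runs.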

Much of this data can be accessed on the open Internet through resources indexed by search engines. However, sources from the “deep web” that ordinary users cannot reach, for example because access must be paid for, also fall under the definition of open source. In other words, OSINT works with all data that is not confidential and is not a trade or state secret.

The main stages of intelligence

Let’s look at the process of conducting reconnaissance described in Michael Bazzell’s book.

1. Make a research plan or define a goal.

2. Prepare the equipment and programs needed to solve the problem.

3. Search on all available identifiers.

4. Collect the information.

5. Analyze the data obtained.

6. Prepare the conclusions and results.

7. Archive or clean the equipment.
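
One possible way to keep an investigation on this sequence is to track the stages explicitly. The Python sketch below is only an illustration under assumed names: `Stage`, `Investigation`, and their fields are hypothetical and do not come from the book. It records which stage has been finished and refuses to skip ahead.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """The seven stages listed above, in order."""
    PLAN = 1
    PREPARE_TOOLS = 2
    SEARCH_IDENTIFIERS = 3
    COLLECT = 4
    ANALYZE = 5
    REPORT = 6
    ARCHIVE_OR_CLEAN = 7


@dataclass
class Investigation:
    goal: str
    identifiers: list
    completed: list = field(default_factory=list)
    notes: dict = field(default_factory=dict)

    def next_stage(self):
        """Return the first stage not yet completed, or None when all are done."""
        for stage in Stage:
            if stage not in self.completed:
                return stage
        return None

    def complete(self, stage, note=""):
        """Mark a stage as done; raise ValueError if it is not the expected one."""
        expected = self.next_stage()
        if stage != expected:
            raise ValueError(f"expected {expected!r}, got {stage!r}")
        self.completed.append(stage)
        self.notes[stage] = note


if __name__ == "__main__":
    # Hypothetical case: the goal and identifier are placeholders.
    case = Investigation(
        goal="Map the public exposure of example.com",
        identifiers=["example.com"],
    )
    case.complete(Stage.PLAN, "scope agreed")
    print(case.next_stage().name)
```

Enforcing the order is a design choice for this sketch; in practice, analysts often loop back from analysis to further searching and collection.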


Despite its long history, open-source intelligence is only beginning to develop, driven by the explosion of available information, and it is a sought-after kind of research. It gives those who use it additional information: for job seekers, a way to learn more about a future employer; for businesses, a way to analyze the market and customers. On the other hand, everyone should understand which information is worth sharing publicly and what can become public information.