
What is the deep web?

The deep web comprises all content that is created, hosted and exchanged on the web but is not indexed by search engines.

The deep web, in short, is a hidden web. To understand what that means, we must first recall two other definitions: that of the Internet and that of the web.

What is the visible web?

The Internet is the network of networks: a completely decentralized space that links together private, public and non-profit networks. A set of standardized protocols allows information to be exchanged over it from almost anywhere.

If we picture it as an iceberg, a small part of this vast whole sits above sea level: this is the web, also called the visible or surface web. The World Wide Web is a system of public pages that are interconnected through hypertext links and indexed by the major search engines. It is one of the applications that run on the infrastructure that is the Internet.

According to Kaspersky, this set of pages, easily accessible through services such as Google, Bing or DuckDuckGo, represents less than 5% of the total volume of content hosted and exchanged through the Internet.

What is the deep web?

Below the waterline of the iceberg lie a great many pages that cannot be reached from a search engine, because they are not indexed, for various reasons. A site may have instructed search engines not to index a given page, or the page may be visible only after logging in.

An example? Your email inbox is not on Google, and neither is your bank account. These are indeed web pages, which you view in a browser, but they cannot be found through a search engine. At best, you will come across a login page, and that is all.

A representation of Internet applications in the form of an iceberg. // Source: Wikimedia

Some of these hidden pages are viewed daily by Internet users. In the iceberg representation, they are therefore often placed just below the waterline. They look and behave like any regular site on the visible web; they are simply not indexed. Further down the iceberg are the pages, information and data that are harder to reach.

Why is there a deep web?

Surface web and deep web are terms that describe the role of search engines and the nature of pages, both of which affect how visible online content is. The deep web is where several phenomena converge, and they do not necessarily have anything in common.

A web page built with technology that a search engine cannot interpret is likely to go unparsed and therefore unindexed. It then falls into the deep web. The same goes for the private areas of websites, such as a user’s messaging service or bank account.

A web page that carries instructions not to be listed by a search engine also falls into the deep web. The same goes for a page that is online but has no incoming links: the crawlers of Google and other engines are unlikely to come across it, so they will not offer it to Internet users even when they type the right keywords.
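One common way for a site to give such instructions is a robots.txt file, which well-behaved crawlers consult before fetching pages. The sketch below uses Python's standard `urllib.robotparser` module to check two URLs against a small rule set. The rules, the `example.com` addresses, the `ExampleBot` agent name and the `may_crawl` helper are all hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # load the rules from memory, no network access


def may_crawl(url, agent="ExampleBot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(may_crawl("https://example.com/articles/deep-web"))    # allowed by the rules
print(may_crawl("https://example.com/private/report.html"))  # blocked by the rules
```

Strictly speaking, robots.txt only tells crawlers what not to fetch. Sites can also ask not to be listed through other mechanisms, such as a "noindex" directive carried by the page itself.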

Clearly, the deep web exists because search engines do not see everything on the web. It is therefore a heterogeneous category. Some of these pages could be indexed if certain parameters changed. Others are simply out of reach, no matter what. Crawlers also have limits of their own and do not identify everything.
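To make these cases concrete, here is a minimal sketch of how a simple crawler discovers pages by following links, and why some pages never end up in its index. It is a toy model under stated assumptions, not a description of how any real search engine works: the site map, the page names and the `NOINDEX` set are all invented for the example.

```python
from collections import deque

# Toy site: each page maps to the pages it links to (all names are hypothetical).
LINKS = {
    "/": ["/news", "/login"],
    "/news": ["/", "/news/article-1", "/news/draft"],
    "/news/article-1": ["/news"],
    "/news/draft": [],
    "/login": [],
    "/account/inbox": ["/"],  # only served after logging in, so no public link leads to it
    "/orphan-page": ["/"],    # public, but no other page links to it
}

# Pages that ask search engines not to list them (assumption for this example).
NOINDEX = {"/news/draft"}


def crawl(start):
    """Breadth-first walk over hyperlinks, starting from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


discovered = crawl("/")
surface = discovered - NOINDEX  # found by the crawler and allowed to be listed
deep = set(LINKS) - surface     # everything else that exists on the site

print(sorted(surface))  # ['/', '/login', '/news', '/news/article-1']
print(sorted(deep))     # ['/account/inbox', '/news/draft', '/orphan-page']
```

In this model the login page is on the surface, while the inbox behind it, the unlinked page and the page that opts out of listing all fall into the deep web.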

How big is the deep web?

The deep web is vast: so much larger than the surface web, in fact, that it is almost impossible to estimate its size accurately.

In 2001, more than twenty years ago, a study conducted by BrightPlanet estimated that the deep web was 400 to 550 times larger than the surface web, which was itself already colossal.

It is difficult to find recent estimates, perhaps because the very idea of calculating the size of the complete web (deep and surface) has been abandoned in the face of the constant growth in the amount of information we produce and put online.

But if we take Kaspersky’s figure, we can express it as a proportion: at least 95% of what exists on the net would belong to the deep web.
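The two figures quoted above can be put on the same scale with a little arithmetic. The sketch below converts a surface-web share into a "times larger" ratio and back. It assumes, purely for illustration, that both sources measure the same thing, which is not guaranteed; the function names are hypothetical.

```python
def deep_to_surface_ratio(surface_share):
    """How many times larger the deep web is, given the surface web's share of the total."""
    return (1 - surface_share) / surface_share


def surface_share_from_ratio(ratio):
    """Surface web's share of the total, given that the deep web is `ratio` times larger."""
    return 1 / (1 + ratio)


# A surface share of 5% means the deep web is about 19 times larger.
print(round(deep_to_surface_ratio(0.05), 2))

# A deep web 400 to 550 times larger means a surface share of roughly 0.25% to 0.18%.
for ratio in (400, 550):
    print(f"{surface_share_from_ratio(ratio):.2%}")
```

Read this way, "less than 5%" and "400 to 550 times larger" are compatible: the second is simply a much smaller surface share than the upper bound given by the first.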

What do we find in the deep web?

Many of the digital places we visit most often imply the existence of thousands of other pages that we will never be able to access. Think, for example, of your bank, where you can access your own account but not the thousands of pages relating to other customers’ accounts. Or your account with an audio or video streaming service, which has thousands of other users, each with their own profile, display, settings and usage data…

All these online spaces are either not detectable by search engines, or tell those engines not to list them, or are protected by security measures, starting with passwords. They include assorted databases, health and legal records and other sensitive material, and company or university intranets, all of which can be consulted provided you know what you are looking for and have the necessary access.

One can also find completely ordinary web pages that would belong on the surface of the net but that use languages, tags or rules that take them off the radar of search engines.

Visible web pages are easily accessible with search engines. // Source: Pixabay, editing with Canva

What is the deep web used for?

The deep web has no particular purpose. It exists only as the result of a “lack of indexing”, or rather of a limit on how much of what is online gets indexed. It simply reflects the varying degrees to which web content is unindexed and confidential. Being in the deep web does not in any way mean that content is illegal: it is just content that sits outside Google or Bing.
