Let's talk about OSINT (open source intelligence). How do people get all of your data? The answer is simple: when you agree to the TOS of an application, you often let the company harvest and sell your information. The crazy thing is that a lot of it ends up publicly available.
To start, we'll begin with Shodan. Shodan is a search engine for the internet of things. Do you have a Wi-Fi enabled baby monitor? Well, guess what: if it's exposed to the internet with default or no credentials, anyone can find it and access it. https://www.csoonline.com/article/3276660/what-is-shodan-the-search-engine-for-everything-on-the-internet.html
Next let's talk about "theHarvester". It's one of the simplest tools to use on this list. theHarvester is designed to capture public information that exists outside of an organization’s owned network. https://github.com/laramies/theHarvester
The sources that theHarvester uses include popular search engines like Bing and Google, as well as lesser-known ones like Dogpile, DNSdumpster and the Exalead metadata engine. It also uses Netcraft Data Mining and the AlienVault Open Threat Exchange.
It can even tap the Shodan search engine to discover open ports on discovered hosts. In general, theHarvester tool gathers emails, names, subdomains, IPs and URLs.
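To make that kind of output concrete, here's a minimal Python sketch of pulling emails and subdomains for one domain out of a blob of already-collected public text. This is not theHarvester's actual code; the function name, the regexes and the sample text are all made up for illustration, and the patterns are deliberately simple.

```python
import re


def extract_emails_and_hosts(text, domain):
    """Return (emails, hosts) found in `text` that belong to `domain`.

    Hypothetical helper for illustration only; real tools handle many
    more edge cases (obfuscated addresses, IDNs, and so on).
    """
    d = re.escape(domain)
    # An email whose host part is the domain or any subdomain of it.
    email_re = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*" + d + r"(?![A-Za-z0-9-])",
        re.IGNORECASE,
    )
    # A hostname with at least one label in front of the domain.
    # The lookbehind skips anything that is the tail of an email or a longer name.
    host_re = re.compile(
        r"(?<![A-Za-z0-9@._-])(?:[A-Za-z0-9-]+\.)+" + d + r"(?![A-Za-z0-9-])",
        re.IGNORECASE,
    )
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


if __name__ == "__main__":
    # Made-up sample text standing in for search engine results.
    sample = (
        "Contact jane.doe@example.com or visit https://vpn.example.com/login "
        "and mail.example.com. Also bob@corp.example.com"
    )
    emails, hosts = extract_emails_and_hosts(sample, "example.com")
    print(emails)
    print(hosts)
```

The point is just that once the text is public, turning it into a list of addresses and hostnames takes only a few lines.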
Back to Shodan: in addition to IoT devices like cameras, building sensors and security devices, it can also be turned to look at things like databases to see if any information is publicly accessible through paths other than the main interface.
Shodan can even work with video games, discovering things like Minecraft or Counter-Strike: Global Offensive servers hiding on corporate networks where they should not be, and what vulnerabilities they introduce.
Another freely available tool on GitHub, Metagoofil is optimized to extract metadata from public documents. Metagoofil can investigate almost any kind of document that it can reach through public channels, including .pdf, .doc, .ppt, .xls and many others. https://github.com/laramies/metagoofil
The amount of interesting data that Metagoofil can gather is impressive. Searches return things like the usernames associated with discovered documents, as well as real names if available.
It also maps the paths of how to get to those documents, which in turn would provide things like server names, shared resources and directory tree information about the host organization.
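To show why a single leaked file path is useful, here's a rough Python sketch that splits a Windows-style path (the kind that often sits in document metadata) into server, share, user and directory pieces. This is not how Metagoofil itself is implemented; the function name and the example paths are invented, and it assumes absolute Windows paths.

```python
from pathlib import PureWindowsPath


def describe_path(raw):
    """Break a Windows path string into the pieces an attacker would care about.

    Hypothetical helper for illustration; assumes an absolute path such as
    a UNC path (\\\\server\\share\\...) or a drive path (C:\\...).
    """
    p = PureWindowsPath(raw)
    info = {
        "server": None,
        "share": None,
        "user": None,
        # Everything between the anchor (drive/share root) and the file name.
        "directories": list(p.parts[1:-1]) if p.anchor else list(p.parts[:-1]),
        "file": p.name,
    }
    # For UNC paths, pathlib reports the drive as \\server\share.
    if p.drive.startswith("\\\\"):
        server, _, share = p.drive[2:].partition("\\")
        info["server"] = server
        info["share"] = share or None
    # A "Users" folder usually has the account name right after it.
    lowered = [part.lower() for part in p.parts]
    if "users" in lowered:
        idx = lowered.index("users")
        if idx + 1 < len(p.parts) - 1:
            info["user"] = p.parts[idx + 1]
    return info


if __name__ == "__main__":
    # Made-up example paths.
    print(describe_path(r"\\FILESRV01\Finance\Reports\2019\budget.xlsx"))
    print(describe_path(r"C:\Users\jsmith\Documents\plan.doc"))
```

Run that over the paths from a few hundred public documents and you start to get a picture of internal server names, shares and usernames without ever touching the network.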
This all just barely scratches the surface of what's available. There are many more tools out there, and this is how people are getting doxxed.