

From email to phone number, a new OSINT approach

Lately I’ve been spending time researching weaknesses and attack vectors in password reset options. At BSides Las Vegas I presented a tool called “Ransombile”. It automates the password reset process over SMS for many Alexa top 100 websites and facilitates targeted attacks when having physical access to locked mobile devices for a short period of time. I’ve also talked about the wide impact of compromising voicemail systems at DEF CON and CCC by abusing password reset over phone calls.

While working on these topics, I spent many hours testing and resetting passwords on various websites. At some point, I started noticing a pattern I hadn’t seen before. When you want to reset a password, you enter the email and are then presented with different options. Those usually include receiving an email with a unique link to click on, getting an SMS with a secret six-digit code, or even receiving a call and hearing the secret code instead.

While reviewing the option of resetting a password via SMS or phone call, I noticed that the UI usually shows part of the phone number. It is masked so that only a few digits are revealed, enough for users to recognize which phone it is in case they have several. In other words, if I know your email, I can initiate the password reset process for your accounts and obtain several digits of your phone number.

As mentioned above, I’ve spent a lot of time resetting passwords and I realized that not all websites reveal the same digits. Some would show the last four, some would show the first one and the last two, and so on. There is no standard way to mask personally identifiable information (PII) such as phone numbers. The masking happens entirely at the developer’s discretion, and that seemed like a problem to me.

Password reset shows 5 digits
2FA shows 3 digits

To show how inconsistent this can be, take Paypal for example. If I initiate the password reset process, it reveals the first digit and the last four. But if I log in and get challenged with 2FA, it reveals only the last three. This doesn’t make any sense. With only your email address, I can get five digits of your ten-digit phone number. If I know your email and password, I only get three. Paypal hides more digits from an attacker who already knows your password than from one who only knows your email address.

Digging deeper

I made a list of popular websites that people tend to be registered on and checked their password reset process. My goal was to identify which sites only ask for an email to initiate the process (no further information needed), which support mobile-based password reset, and how many digits each one “leaks”. Here is a small subset:

Leaks first three and last two digits:

  • eBay

Leaks first digit and last four digits:

  • Paypal

Leaks first and last two digits:

  • Yahoo

Leaks last four digits:

  • Lastpass

Leaks last two digits:

  • Google
  • Facebook
  • Twitter
  • Hotmail
  • Steam

Looking at the list above, we can conclude that if you have, for example, an eBay and a LastPass account, an attacker can learn seven out of ten digits of your phone number just by knowing your email address. In other words, an attacker can use your email address to reduce the number of possible phone numbers from ten billion (every ten-digit combination) to one thousand.
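To make the counting concrete, here is a minimal sketch of how two masked hints can be overlaid. The masks, the `*` placeholder and the function name `merge_masks` are assumptions for illustration (the digits are made up), not the output format of any particular site.

```python
def merge_masks(*masks: str) -> str:
    """Overlay masked numbers of equal length; '*' marks an unknown digit."""
    if len({len(m) for m in masks}) != 1:
        raise ValueError("all masks must have the same length")
    merged = []
    for column in zip(*masks):
        known = {ch for ch in column if ch != "*"}
        if len(known) > 1:
            # Two sites disagree on a digit: the hints belong to different numbers.
            raise ValueError(f"conflicting digits: {sorted(known)}")
        merged.append(known.pop() if known else "*")
    return "".join(merged)


# Made-up example: first three + last two digits, and last four digits.
ebay_hint = "415*****67"
lastpass_hint = "******4567"

combined = merge_masks(ebay_hint, lastpass_hint)
unknown = combined.count("*")
print(combined, "->", 10 ** unknown, "candidates left")
```

With three unknown positions left, the search space is 10^3, which is the one thousand mentioned above.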

This is not the only combination possible but let’s focus on this scenario in this post.

Discovering the remaining numbers

We have seven out of ten digits which means we are only missing three. At this point, it is important to focus on which numbers we know.

A US phone number is composed of 3 fields: area code (or NPA), exchange (or Central Office Code) and subscriber number (kudos to @jjarmoc who told me about the exchange and made me realize there was more to this than I thought). There is also the country code but we are focusing on US numbers for now.

US phone number fields

eBay+LastPass gave us the area code and the subscriber number. It is important to highlight that we are not simply missing 3 digits, we are missing the 3 digits corresponding to the exchange. This is an important distinction as it will help us narrow down the possibilities even further.

NANPA

I put in quite a number of hours researching and learning about exchanges. My main goal was to understand if I could reliably reduce the remaining thousand possible phone numbers by detecting exchange numbers not assigned to a specific area code.

Enter the North American Numbering Plan Administrator (NANPA). It administers the North American Numbering Plan, the numbering plan for the public switched telephone network in Canada, the US and its territories, and some Caribbean countries. This website is a goldmine! I learned so much about how the telephone systems work just from this source. Most importantly, I found exactly what I was looking for.

NANPA maintains an updated list of area codes and the corresponding exchanges that is publicly accessible. It is updated frequently and you can query the data or download a parseable file with all the information.

Exchanges in San Francisco

How useful is this? Well, let’s take San Francisco’s 415 area code. If I am only missing the three digits corresponding to the exchange, I have a thousand possible numbers for my target. By using the NANPA dataset, I reduced it to 784 possible numbers because there are 216 exchange numbers not assigned to the 415 area code. That is a reduction of over 20%, not bad!

But how good does it get? I played with different area codes and, for example, the Alaska 907 area code has only 625 exchanges assigned. That’s 375 phone numbers we don’t need to consider anymore by just using the valuable information that NANPA provides us. Or Tacoma’s 253 area code with only 458 exchanges. We got rid of over half the possible phone numbers.
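The filtering step can be sketched roughly as follows. The rows below are a toy stand-in, not real NANPA data, and the function names are hypothetical; in practice the (area code, exchange) pairs would be parsed from the file NANPA publishes.

```python
def load_assigned_exchanges(rows):
    """Group (area_code, exchange) pairs into {area_code: set of exchanges}."""
    assigned = {}
    for area_code, exchange in rows:
        assigned.setdefault(area_code, set()).add(exchange)
    return assigned


def expand_missing_exchange(area_code, subscriber, assigned):
    """Build one candidate per exchange that is assigned to the area code."""
    exchanges = sorted(assigned.get(area_code, ()))
    return [area_code + exchange + subscriber for exchange in exchanges]


# Toy sample for illustration only (NOT the real assignment table).
toy_rows = [("415", "272"), ("415", "273"), ("253", "200")]
assigned = load_assigned_exchanges(toy_rows)

# Known: area code 415 and subscriber number 4567; the exchange is missing.
numbers = expand_missing_exchange("415", "4567", assigned)
print(numbers)
```

Exchanges that never appear in the table simply produce no candidate, which is where the reduction from 1000 to 784 (or 458 for Tacoma) comes from.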

What if the target only has a Paypal account? We know five out of ten digits. But again, which ones? We have the first digit of the area code and the last four random digits. Let’s imagine that you know the target is from California. Thanks to NANPA, we know all the area codes corresponding to California. There are only two area codes in California that start with 2: 213 and 209. Another two start with 3, two start with 4, etc. By knowing the first digit of the area code, you can still infer the first three digits of the phone number fairly easily.
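As a small illustration of that inference, assuming a lookup table from state to area codes has already been built from NANPA’s data: the table below only contains the handful of area codes mentioned in this post and is deliberately incomplete, and the names are hypothetical.

```python
# Deliberately incomplete sample: only area codes mentioned in this post.
STATE_AREA_CODES = {
    "CA": ["209", "213", "415"],
    "WA": ["253"],
    "AK": ["907"],
}


def area_codes_matching(state, first_digit, table=STATE_AREA_CODES):
    """Return the state's area codes that start with the leaked first digit."""
    return [code for code in table.get(state, []) if code.startswith(first_digit)]


print(area_codes_matching("CA", "2"))
```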

National Pooling Administration

But what if the target only has an eBay account? Or Paypal + Google? We have the area code and the last two digits of the subscriber number. Again, let’s focus on which numbers we know. I discussed above how we can use NANPA’s public records to narrow down possible numbers based on the area code and the exchange. Are there any public records that can help us discard invalid phone numbers based on the subscriber number? Yes! Thanks to the National Pooling Administration.

Ready? Number pooling is a way to assign smaller blocks of numbers (in the thousands) to growth areas. Historically, a phone number is a way of routing a call to a person in a physical location. Take 415-272-XXXX. The first 3 digits narrow it down to a wider area like San Francisco, the 272 exchange is specific to Sausalito and the missing 4 digits specify the actual person (subscriber) in that limited area. Because carriers own the specific area code + exchange, this means that there are 10,000 phone numbers assigned to Sausalito residents that have a plan with AT&T (the carrier owning 415-272).

As of 2017, Sausalito has 7110 residents. This means that of the 10k available numbers, only about 7k will be used, and that is only if everyone is an AT&T customer. With this way of assigning numbers, many go to waste.

The arrival of cable modems and VoIP services, which made it easier to become a carrier, worsened the problem. The FCC decided that numbers should be assigned in smaller blocks in growth areas, specifically in blocks of one thousand numbers rather than ten thousand. Therefore, blocks of numbers are assigned to carriers as XXX-XXX-X, including the first digit of the subscriber number.

The National Pooling Administration is responsible for managing this and keeps public records of the assigned blocks, including the subscriber digit. We can use this data to further discard invalid numbers. For example, taking our Sausalito number 415-272-XXXX which is missing the last 4 digits, we can use the public records to discard phone numbers like 415-272-[0-8]XXX and focus just on numbers whose subscriber number starts with 9. In other words, we have reduced the possible valid phone numbers from ten thousand to one thousand.

9th block is the only one assigned
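The same idea in code, as a rough sketch: a thousands-block is identified here by area code, exchange and the first subscriber digit. The set of assigned blocks is hard-coded from the Sausalito example above; a real run would load it from the pooling records, and the function name is an assumption.

```python
def filter_by_thousands_block(candidates, assigned_blocks):
    """Keep numbers whose (area code, exchange, first subscriber digit) is assigned."""
    return [n for n in candidates if (n[:3], n[3:6], n[6]) in assigned_blocks]


# From the example in the text: only block 9 of 415-272 is assigned.
assigned_blocks = {("415", "272", "9")}

# Every possible subscriber number for 415-272-XXXX.
candidates = [f"415272{sub:04d}" for sub in range(10_000)]
remaining = filter_by_thousands_block(candidates, assigned_blocks)
print(len(candidates), "->", len(remaining))
```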

Still… many possible numbers remaining

You have the email address of your target, who happens to be from Tacoma and has an eBay and a LastPass account. You initiate the password reset process and harvest seven out of ten digits of his phone number. Now, you can use NANPA to get rid of 542 candidates and reduce the list to 458 possible numbers assigned to that email address. Then you use the National Pooling Administration to check whether the block number you have is assigned to each of the possible exchanges, reducing the possible valid phone numbers to 445.

What now? It is still a fair number of phone numbers. I would claim that reducing ten billion possible phone numbers down to 445 knowing only an email is pretty significant. The remainder could even be tested manually. But the goal is to reduce the possibilities as much as possible before attempting any manual verification. Let’s go back to the drawing board!

There are a number of ways you could take the remaining phone numbers and see if they are somehow linked to the email address. You could use search engines with well-defined search flags to try to find clues in case the target left his phone number in a forum, website, etc. You could look up each phone number on online services like pipl, BeenVerified or Spokeo that have huge databases with people’s personal information. You could even use telephone system online services that allow you to reverse search the owner of a phone by its number. Basically, a phonebook in reverse. I was actually pretty shocked by the amount of personal information I was able to get from services like WhitePages via its Twilio add-on by just providing a phone number and paying ten cents.

These options are good but not 100% reliable. You may not find anything in search engines, online data farms may not have your target’s phone number, and WhitePages tends to be somewhat outdated; many times it just doesn’t have the information you need. So, I started to think of new ways I could reliably obtain the phone number assigned to an email address.

Reusing the same attack vector, in reverse

It did not take long to get to that sweet “Eureka” moment. I reflected on the steps I took to get this far. I was abusing the password reset function of online services to collect a few phone number digits assigned to an email…

Hmmm.. I reset the password putting an email… and I get a few digits back. Can I… reset the password by entering a phone number and get a few email characters back?

Amazon password reset using phone number

Eureka! Turns out, there are popular services, like Amazon and Twitter, that allow you to reset the password by entering a phone number and get an email to complete the process. Most importantly, it will display a few characters of the email address it will send the link to. In Amazon’s case, you get the first and last letter of the username and the full domain. You also get the length of the username as the number of * matches the number of masked chars.

Twitter shows you the first two characters of the username and the first one of the domain. You also will know the length by counting the asterisks.
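Checking a masked hint against a known email address can be sketched as a simple position-by-position comparison. This assumes every `*` stands for exactly one hidden character, which the text confirms for the username; that the same holds for a masked domain, and the exact layout of the second example mask, are assumptions.

```python
def mask_matches(masked: str, email: str) -> bool:
    """True if every visible character of the mask agrees with the email."""
    if len(masked) != len(email):
        return False
    return all(m == "*" or m.lower() == e.lower() for m, e in zip(masked, email))


target = "janedoe@example.com"

# First and last letter of the username plus the full domain.
print(mask_matches("j*****e@example.com", target))
# First two letters of the username and first letter of the domain
# (assumed layout: the dot stays visible, everything else is masked).
print(mask_matches("ja*****@e******.***", target))
# Same length, but the last letter of the username differs.
print(mask_matches("j*****e@example.com", "janedoh@example.com"))
```

Note that a match is only evidence, not proof: many usernames share the same first letter, last letter and length, so the more characters a service leaks, the fewer false positives remain.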

The attack vector looks like this:

1. Use the target’s email address to initiate the reset password process in multiple sites to harvest several phone number digits

2. Reduce the possible phone number list by discarding non-existing area codes, exchanges and subscriber numbers using NANPA and the National Pooling Administration publicly available data

3. Initiate the password reset process iterating over the remaining phone number list and correlate the leaked email chars against the target’s email address

By following these steps, you will be able to obtain the full ten-digit phone number associated with the email address, without having to make a single call! Just by abusing password reset options and bruteforcing efficiently using publicly available information.
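The correlation in step 3 can be sketched as a loop over the remaining candidates. The `lookup` callable is a hypothetical stand-in for whatever returns the masked email shown for a phone number; here it is backed by a made-up dictionary so the example stays self-contained and makes no network requests.

```python
from typing import Callable, Iterable, List, Optional


def mask_matches(masked: str, email: str) -> bool:
    """True if every visible character of the mask agrees with the email."""
    if len(masked) != len(email):
        return False
    return all(m == "*" or m.lower() == e.lower() for m, e in zip(masked, email))


def correlate(candidates: Iterable[str], target_email: str,
              lookup: Callable[[str], Optional[str]]) -> List[str]:
    """Return the candidate numbers whose masked email fits the target."""
    hits = []
    for number in candidates:
        masked = lookup(number)  # None means no account for this number
        if masked is not None and mask_matches(masked, target_email):
            hits.append(number)
    return hits


# Made-up data standing in for the masked hints a service would display.
fake_hints = {
    "4152729123": "j*****e@example.com",
    "4152729456": "b**@example.org",
}

hits = correlate(["4152729000", "4152729123", "4152729456"],
                 "janedoe@example.com", fake_hints.get)
print(hits)
```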

Automation

The attack vector above can be done manually. You can use services like namechk to pinpoint where to go harvest digits. Look at NANPA’s data yourself to discard invalid numbers. You could even bruteforce the remaining phone numbers to find the matching email using web proxy features like Burp’s intruder. But you don’t need to: I wrote a tool that will do all this for you.

email2phonenumber is a tool that allows you to provide a partial phone number and get a list of all the possible valid phone numbers, eliminating non-existing area code and exchange numbers. The tool will also let you bruteforce phone numbers using Amazon’s and Twitter’s password reset feature and correlate the masked emails against the one you provided looking for a match. It will attempt to fly under the captcha radar by replicating user behavior and randomizing some parameters in the requests. It also supports the use of proxy servers. What we are doing is starting the password reset for different phone numbers. This means that the services cannot detect you based on a specific phone number you are hammering on.

There are multiple other services that allow password reset using phone numbers that can be used for the same purpose. The tool supports Amazon and Twitter for bruteforcing. The idea is to get support from the community through pull requests to support additional ones.

You can find the tool in my GitHub repo.

Demo

phonerator

email2phonenumber is a great tool but much of what it does can be done with tools like Burp or wfuzz. The true power lies in the gathering, parsing and use of the publicly available data related to a country’s phone numbering plan.

Therefore, I am working on a new online service that would allow you to generate lists of possible phone numbers. It will have multi-country support, give you much more information and detail, keep historical records and, most importantly, offer advanced filters.

Say you know the target has AT&T, you can filter by carrier and reduce the list of possible numbers even more. You may have intel that the person is from California, phonerator will take into account only those area codes. Maybe you know that the target had the phone number for over two years, so let’s discard exchanges and block numbers that were assigned recently.

I am still working on the tool and collecting all data. Please stay tuned for updates and release dates on my twitter account. It’s finally here!

Other countries

So far I’ve focused on US phone numbers but there are additional issues I want to highlight when considering targets from other countries. I am from Spain myself and Spanish mobile phone numbers have interesting properties as well. For starters, all mobile phone numbers start with the digit 6 (and recently 7). Also, phone numbers are only 9 digits long. Why is that important? Well, I know one digit from the beginning, and services like eBay or LastPass do not adjust their mask to leak fewer digits for shorter phone numbers. Therefore, if my target is from Spain, just with LastPass I know 5 out of the 9 digits. That’s over half the digits.

I’ve observed the problem of using the same mask for all customers in other websites as well. My next step was to look for countries that had very short numbers. Take Iceland, Estonia or San Salvador with 7 digit phone numbers. All eBay customers from these countries have 5 out of 7 digits exposed to anyone that knows their email address. Combine it with LastPass accounts and you got yourself the target’s full phone number. Just by knowing their email address…

By understanding the properties of a country’s phone numbering system and taking advantage of websites that do not adjust their masking to leak fewer digits on shorter numbers, it is possible to harvest all digits of the phone number.
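The effect of a fixed mask on shorter numbers is easy to check with a few lines. This sketch models each mask as "show N leading and M trailing digits", using the eBay (first three, last two) and LastPass (last four) masks listed earlier; the names are illustrative.

```python
def revealed_positions(length, masks):
    """Count digit positions exposed by at least one (show_first, show_last) mask."""
    revealed = set()
    for show_first, show_last in masks:
        revealed.update(range(min(show_first, length)))
        revealed.update(range(max(length - show_last, 0), length))
    return len(revealed)


EBAY = (3, 2)
LASTPASS = (0, 4)

for length in (10, 9, 7):
    print(length,
          revealed_positions(length, [EBAY]),
          revealed_positions(length, [EBAY, LASTPASS]))

# Spanish mobile: the leading digit is known in advance, LastPass adds four more.
print(revealed_positions(9, [(1, 0), LASTPASS]))
```

For a 7-digit number the two masks together cover every position, which is the full-disclosure case described above.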

So what?

I showed you how to go from an email address to a phone number. So what? Is that really so bad? Well, we can answer this question from different angles like privacy and security but let’s list a number of attack vectors that originate from the knowledge of a target’s phone number:

  • SIM swapping. This is an issue that gained a lot of notoriety lately and it is more common than you might think. Attackers are able to port your phone number to a SIM in their control by different means like social engineering, extortion or even rogue carrier employees. This would allow the attackers to reset passwords on your behalf or bypass 2FA protections.
  • SS7 attacks. SS7 is a protocol used by carriers to communicate with each other. It is very old and researchers have demonstrated at multiple security conferences how it can be abused to track the location of individuals, or even spy on communications.
  • Targeting your voicemail. Check out my DEF CON talk to understand the impact of having an attacker compromise your voicemail. Seriously, voicemail systems are a threat.
  • Location tracking. Joseph Cox has done great coverage of this issue.
  • CallerID spoofing. Plenty of online services allow you to spoof caller IDs. This is a great tool for social engineering.

Conclusions

The lack of a standardized way to mask PII leads to different approaches taken by online services. These are leaking partial information about your email address and phone number in places like the reset password area. It can be abused and automated to scrape pieces of information with the intent of reconstructing the targeted data.

It is possible, especially in targeted attacks, to obtain all digits of a phone number associated with an email address. Once the attacker is in possession of the phone number, he can use it for other attacks with serious impact that can lead to full account compromise, location tracking and spying.

This is especially true in countries with shorter phone numbers, as many services do not adjust their masking to the phone number length.

Masking is not good enough, even if it just shows a couple of the phone number’s digits. Inhabitants of Saint Helena, with their 5-digit phone numbers, deserve the same security bar as the rest of us. For emails, masking only the username is not sufficient: the domain can provide information about where that person may work. Even the TLD can give away whether that person is a student or which country they might be from.

My proposal is to allow users to set labels. For example, the user can label an email address as “personal email” or the phone number as “work phone”. This way, when the password reset process displays a hint, it will show the label rather than tidbits of PII.
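A minimal sketch of what label-based hints could look like on the server side. The class, field names and hint wording are all hypothetical; the point is only that the hint is built from the user-chosen label and never from the stored address or number.

```python
from dataclasses import dataclass


@dataclass
class RecoveryChannel:
    kind: str     # "email" or "sms"
    address: str  # full email or phone number, never rendered in the hint
    label: str    # chosen by the user, e.g. "work phone"


def reset_hint(channel: RecoveryChannel) -> str:
    """Build the hint shown during password reset from the label only."""
    return f"We will send a code to your {channel.label}"


channel = RecoveryChannel(kind="sms", address="4152729123", label="work phone")
print(reset_hint(channel))
```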

Users should not provide their phone number unless strictly required. Many online services ask for it but there is no real business need. They just want more information about you. If it is a requirement, consider using a virtual number like Google Voice or even a dedicated SIM that you only use for this purpose and never give the number away.

This is not just about OPSEC, it’s about keeping your data secure.

Responsible disclosure

I reached out to the online services that were showing more than 2 digits, especially if they were part of the area code or exchange. LastPass updated the mask to just show the last two digits corresponding to the subscriber number. eBay now shows the first and last two digits, not great but better than it was before. Yahoo is still considering risks and mitigations as of this writing.

Paypal, which displays five digits including area code to anyone knowing the email address (but only three if the attacker knows the target’s password), decided this is working as designed and will not take action.

Resources

I presented this research at BSides Las Vegas and the Recon Village @ DEF CON

You can download the slides from slideshare

Exploding the Phone: The Untold Story of the Teenagers and Outlaws Who Hacked Ma Bell by Phil Lapsley

Fantastic book all around phone phreaking.

The privacy, security & OSINT show – Episode 111

This particular episode talks about the issues of providing your phone number to online services.

27 comments

  1. B says:

    Have you taken a crack at payfone? Don’t trust it’s cover. Advertisers are using it with zero necessary infrastructure security to protect pii to drive attribution and measures. Once you found a cve from 2016 on some ec2 instance from some negligence ad tech org you could get it all, billing address for their cell, email, idfa, ein.

  2. Striker says:

    In Italy every phone number starts with a ‘3’. Every mobile phone operator has a range for the first three digits. For example, Wind (a popular operator in Italy) typically has ’32X’ or ’39X’. Not every number is used (there’s no ’30’ or ’31’).

  3. Anonymous says:

    In Germany phone numbers can be kept when changing providers. Also, landline numbers can be kept when moving anywhere else within the country. This does not make phone numbers random, but it would lead to plenty of false positives with this approach.

    Phone number possession is a consumer right for quite some time now, and the providers have changed their routing infrastructure for higher flexibility. We would need access to the current phone number routing tables, but they‘re confidential and only stored as local shards. It would be much easier to use the old ISDN protocol which supports knocking without costs, alas there‘s strong brute force protection behind it, and the protocol is not widely available anymore.
    …could have worked in the 90s, I guess; Ask CCC.

    Seems like not all is lost in good old Germany.

  4. Anonymous says:

    Hi!

    I have tried two email addresses. No way(((

    Traceback (most recent call last):
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connection.py”, line 174, in _new_conn
    conn = connection.create_connection(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\util\connection.py”, line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\socket.py”, line 961, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    socket.gaierror: [Errno 11001] getaddrinfo failed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connectionpool.py”, line 703, in urlopen
    httplib_response = self._make_request(
    ^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connectionpool.py”, line 386, in _make_request
    self._validate_conn(conn)
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connectionpool.py”, line 1042, in _validate_conn
    conn.connect()
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connection.py”, line 358, in connect
    self.sock = conn = self._new_conn()
    ^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connection.py”, line 186, in _new_conn
    raise NewConnectionError(
    urllib3.exceptions.NewConnectionError: : Failed to establish a new connection: [Errno 11001] getaddrinfo failed

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\requests-2.28.1-py3.11.egg\requests\adapters.py”, line 489, in send
    resp = conn.urlopen(
    ^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\connectionpool.py”, line 787, in urlopen
    retries = retries.increment(
    ^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\urllib3-1.26.12-py3.11.egg\urllib3\util\retry.py”, line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host=’fyp.ebay.com’, port=443): Max retries exceeded with url: /EnterUserInfo?ru=https%3A%2F%2Fwww.ebay.com%2F&gchru=&clientapptype=19&rmvhdr=false (Caused by NewConnectionError(‘: Failed to establish a new connection: [Errno 11001] getaddrinfo failed’))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Scripts\email2phonenumber.py”, line 1019, in <module>
    start_scrapping(args.email, args.quiet)
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Scripts\email2phonenumber.py”, line 538, in start_scrapping
    scrape_ebay(email)
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Scripts\email2phonenumber.py”, line 625, in scrape_ebay
    response = session.get(
    ^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\requests-2.28.1-py3.11.egg\requests\sessions.py”, line 600, in get
    return self.request(“GET”, url, **kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\requests-2.28.1-py3.11.egg\requests\sessions.py”, line 587, in request
    resp = self.send(prep, **send_kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\requests-2.28.1-py3.11.egg\requests\sessions.py”, line 701, in send
    r = adapter.send(request, **kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File “C:\Users\User\AppData\Local\Programs\Python\Python311-32\Lib\site-packages\requests-2.28.1-py3.11.egg\requests\adapters.py”, line 565, in send
    raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPSConnectionPool(host=’fyp.ebay.com’, port=443): Max retries exceeded with url: /EnterUserInfo?ru=https%3A%2F%2Fwww.ebay.com%2F&gchru=&clientapptype=19&rmvhdr=false (Caused by NewConnectionError(‘: Failed to establish a new connection: [Errno 11001] getaddrinfo failed’))
