Author: Paul
Anonymity – Part 1
On Episode 6 of The Edge of Innovation, we are going to be talking about what it means to be anonymous without an identity in the digital age. Paul is going to be helping us think through what that means and what it is.
Transcript
Sections
Introduction: What does it mean to be anonymous?
Identification by payment method
Anonymity in the past vs. now
Cell phones and normal behavior
Browsing and IP proxies: Peeling back the onion
Fingerprinting your profile
Secure wi-fi as an identifier
Introduction
Paul: This is the Edge of Innovation, Hacking the Future of Business. I’m your host, Paul Parisi.
Jacob: And I’m Jacob Young.
Paul: On the Edge of Innovation, we talk about the intersection between technology and business, what’s going on in technology and what’s possible for business.
Introduction: What does it mean to be anonymous?
Jacob: Paul, can you talk us through what does it mean to be anonymous today?
Paul: Sure. If we look at the definition – I didn’t look it up before this – but it has different meanings to different people. It’s similar in the way that somebody says, “Well, I want a secure server.” Or “I want a secure system.” Or “I want my stuff to be secure.” It’s like, well what do you mean by that?
Because security has a different meaning to everybody as well. There’s towns in America that feel that they’re secure, and they leave their doors open. They don’t even have locks on them. So, they’re secure. The… So, it’s really a perception.
Identification by payment method
So, anonymous… Again, what does that mean? And if we take it to the extreme, that somebody doesn’t know who I am, and can’t figure that out, that may be true anonymity. So, for example, if you take a dollar bill, and you go to a store, and you buy something, forget about any technology around that, you’re anonymous. There is really no way to know that you had that dollar bill with that serial number on it. It does have a serial number, which is interesting. Why does it have a serial number and they’re unique? It implies some scarcity from the government.
So, if you walk into a convenience store and take a dollar out and pay for a candy bar, there is no way, through that dollar, to track who you are. Very differently than if you take a credit card out and you buy a candy bar. Now all of a sudden, I have… I know exactly who you are. Certainly the credit card company does – Visa, American Express. They know that you were there at that day. They may even know the time. They don’t really know what you bought, but you were there.
If there was a robbery around on that day, one of the things that we could do is if we had electronic payments and that’s all we accepted… Somebody came in… Well, the robber probably wouldn’t have used a credit card.
So, let’s just think about this. How do you get anonymity? So, let’s take it this way. You go to the bank. And you ask for a dollar out of the ATM and the bank carefully scans the number of that dollar and says, “I am giving that to Jacob.” And it has a photograph of you, so it does a face identification of you, and it knows that Jacob has that dollar.
You then go to the grocery store or the convenience store, and you spend that dollar. And while they don’t take your name, they scan that dollar as it goes into the cash register, and they get the serial number of that. And they know that you bought a Milky Way bar. So, somebody with the wherewithal could hack the bank, or the bank could freely give it up, or the government could demand it. “We want all the money that you gave out and all the serial numbers and who you gave them to.”
And then they could go to all the stores and say, “We want all of the money that you got in and what they bought with it.”
So, now this is a little bit of a silly example, but the fact of the matter is that I could track the fact that Jacob got a dollar at one o’clock from an ATM and then went to a store within a 10 minute time period, so it’s feasible – he didn’t go to Los Angeles from Boston and spend the money – and spent that dollar with that serial number. And I could subpoena from the store what he bought. And he bought a Milky Way bar. Now there is a problem there in that you could claim that you gave the dollar away.
Jacob: Sure. You could do that. Yeah.
Paul: Okay. So, now thankfully, as of right now… Honestly, I don’t know. It is a very good question. Do ATMs record the serial numbers of bills that they give out to people? Why wouldn’t they? Because it’s an inventory problem, they would certainly want to know it. And it’s trivial to do that. And then, now the stores certainly don’t. You don’t see them scanning a dollar bill in that way. They just put it in the drawer.
But the fact of the matter is, if I give you that, I could have some stickiness to that information and identify who got that dollar bill. So, let’s say you got the dollar bill at 1PM and you gave it to Mr. Bad Guy at 2PM, and he went into the store and bought bullets with it and committed a crime with that. Or bought a gun, let’s say. Well, something that isn’t tracked, but… So, he brought, let’s say, lighter fluid, because he wanted to light a fire, do something bad. And he bought that.
If we track those serial numbers, it would come back to you. So, that’s the problem with tracking currency.
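The tracking Paul describes amounts to joining two logs on the bill's serial number. Here is a minimal sketch in Python, assuming the ATM and the store both recorded serial numbers. The record layout, the serial number, the date, and the function name `trace_purchases` are all made up for illustration; nothing here reflects how any real bank or register works.

```python
from datetime import datetime

# Hypothetical ATM log: which bill was handed to which customer, and when.
atm_log = [
    {"serial": "B12345678A", "customer": "Jacob",
     "time": datetime(2018, 6, 19, 13, 0)},
]

# Hypothetical store log: which bill paid for which item, and when.
store_log = [
    {"serial": "B12345678A", "item": "Milky Way bar",
     "time": datetime(2018, 6, 19, 13, 10)},
]

def trace_purchases(atm_log, store_log):
    """Join the two logs on serial number to link a withdrawal to a purchase."""
    withdrawn = {record["serial"]: record for record in atm_log}
    matches = []
    for sale in store_log:
        withdrawal = withdrawn.get(sale["serial"])
        # Only count sales that happened after the bill left the ATM.
        if withdrawal is not None and sale["time"] >= withdrawal["time"]:
            elapsed = sale["time"] - withdrawal["time"]
            matches.append({
                "customer": withdrawal["customer"],
                "item": sale["item"],
                "minutes_after_withdrawal": elapsed.total_seconds() / 60,
            })
    return matches

for match in trace_purchases(atm_log, store_log):
    print(match)
```

Note that the join only shows who withdrew the bill, not who spent it. If Jacob handed the dollar to someone else in between, the result would still point at Jacob.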
Anonymity in the past vs. now
Now, Bitcoin makes it easy to track. And, 25 years ago, before the internet, before Mr. Gore did the wonderful work he did for us in creating the internet, you would… Kids, when I grew up, we’d go out to play after school. We’d get home from school, and then we’d go out to play, and our parents really had no clue where we were, and they had no way to reach us. We were smart enough, we’d learned, that we needed to be home for dinner. We would be home for dinner. And usually we were within earshot of highly developed lungs of our mothers.
But nowadays… And we were completely anonymous at that point. We didn’t have an ID.
Jacob: You mean, apart from a social security.
Paul: Yeah, well sure. But you didn’t carry that around with you. So, if somebody were to see me, they might be able to identify me as a child and say, “Oh that was Paul, who was over there painting that graffiti.” But if they didn’t get a good glimpse of me, there’s really no way to know where I am.
Cell phones and normal behavior
Now let’s introduce a cell phone. Cell phone companies know exactly where we are, every single minute of every day. So… Or more so, they know where your cell phone is every single moment of every day. I’m sitting here with an iPad next to me that is on T-Mobile. T-Mobile knows where that iPad is. So, just that ripple of interjection there changes the concept of anonymity hugely. I mean, there is nothing called anonymity unless people close their eyes to the data.
So, the data gets produced and recorded in perpetuity and we can go back and correlate all that data, if we have access to it. Law enforcement can usually get a…
Jacob: They can subpoena that information.
Paul: Right. They can get legal access to it. Now there are people like me and others who have the technical ability to do it but don’t have the legal ability to do it. Because I don’t think a judge would give me a warrant to search your data.
Now, but if I were a criminal or willing to do something illegal, what’s preventing me from correlating all of that information? Well, first of all, it’s a lot of work. And it’s got to be worth it. But if it’s worth it, I could go and find out where you used your credit cards. I could go and find out.
Jacob: Create a profile of somebody from that information that’s available.
Paul: Yeah. Or just…and do all sorts of things, and really track what you’re doing. Now, you could sit there and claim, “Well, somebody else had my credit card that day and used it.” Okay. But, hopefully if we look at a wide enough view of things, that’s going to be saying… So, you haven’t had control of your credit card for the last six months.
And you look at people’s behaviors and they are very normal. There are very few outliers. You don’t usually do something that’s…
Jacob: Irrational.
Paul: Yeah, well, or just outside…an outlier, outside of your norm. I’d look at it, when I’m traveling and I’m in an airport, I go to this restaurant or this or… It would be interesting to draw that profile, that person. And somebody could look at that and say, “Hmm. Paul isn’t behaving as he normally does when he travels.”
Jacob: Right. Doesn’t this relate to… We’ve talked previously about the whole concept of Big Data.
Paul: Absolutely. Yeah, this is Big Data. I mean, Big Data is all of these little observances that can be correlated together. So, we started out asking what is anonymity. And I’ve spent a lot of time trying to figure out how could somebody become anonymous.
Browsing and IP proxies: Peeling back the onion
So, let’s imagine you want to browse the internet anonymously. Well, you have a problem in that more than likely you have internet service. And that internet service comes with an IP address. So, when you sign up for Comcast or Verizon Fios or any of the different internet providers, you get an IP address that is assigned to your modem or router. Now, that may change. But they log all of that. So, they know that on Tuesday June 19th, you had IP address 167.50.12.14.
And you can actually…I’ve been part of a subpoena where we subpoenaed those kind of things. We wanted to know… And there was legal precedent to do it. So, we said, give us all the IP addresses this house had over this period of time. And we got that list. And that was a couple of years ago. In other words… So it wasn’t like I was asking for last week’s information. I was asking for a couple of years ago information.
So, now all of a sudden, I can go to websites that you might have gone to. Let’s say it was a bad political website that seems reasonable to be concerned about. We’re sort of dismissing free speech here at the moment. So, I go there, and I look at their logs. I ask them for their logs or I hack them and get them. And I look for that IP address. And I look at the date you had it. You had it on June 19th, and I see that that IP address accessed it on this date.
So, that’s, like, the hardest way to do it. Now, what’s interesting is that Verizon, Comcast can easily keep a list of every website you connect to. So, they could see… There is nothing preventing them from keeping that information and say, “Oh, every day, at six o’clock, Jacob comes home and opens up ESPN,” because they know. So, really what they could say is somebody in Jacob’s household, using his internet, is browsing ESPN at this time.
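The log lookup described above can be sketched as a match between two record sets: the provider's history of which subscriber held which IP address, and a website's access log. This is a rough Python illustration only. The field names, the second IP address, the year, and the function name `attribute_hits` are assumptions, not any provider's real format.

```python
from datetime import datetime

# Hypothetical ISP records: who held which address during which period.
isp_leases = [
    {"subscriber": "Jacob's household", "ip": "167.50.12.14",
     "start": datetime(2018, 6, 19), "end": datetime(2018, 6, 20)},
    {"subscriber": "another household", "ip": "167.50.12.14",
     "start": datetime(2018, 6, 20), "end": datetime(2018, 6, 21)},
]

# Hypothetical website access log.
site_log = [
    {"ip": "167.50.12.14", "time": datetime(2018, 6, 19, 18, 0), "path": "/scores"},
    {"ip": "167.50.12.14", "time": datetime(2018, 6, 20, 9, 30), "path": "/news"},
    {"ip": "203.0.113.7", "time": datetime(2018, 6, 19, 18, 5), "path": "/scores"},
]

def attribute_hits(isp_leases, site_log):
    """Attach a subscriber to each hit whose IP and time fall inside a lease."""
    attributed = []
    for hit in site_log:
        for lease in isp_leases:
            same_ip = hit["ip"] == lease["ip"]
            in_window = lease["start"] <= hit["time"] < lease["end"]
            if same_ip and in_window:
                attributed.append((lease["subscriber"], hit["time"], hit["path"]))
    return attributed

for subscriber, when, path in attribute_hits(isp_leases, site_log):
    print(subscriber, when, path)
```

The date check matters because addresses get reassigned: the same IP address on a different day can belong to a different household.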
Now, if it was how to buy illegal weapons or child pornography, that might be a thing where I can say, “Well that’s not good. We need to go and do something about that as a society and intervene.” So, then you say, “Well, how do I prevent that? How do I actually go out and not allow the organizations to see what I’m doing?”
And there’s these things called proxy servers. So, a proxy server is a server that browses on your behalf. So, you would open up a web browser and configure your computer so that all of your traffic, although it goes out this pipeline to Verizon – let’s use Verizon as the example – that information will go to Verizon and go out to the internet.
So you set up a virtual connection now that all the traffic that you have is going to go to this IP address that may be in Canada, may be in Europe, who knows where. And then it is going to go and browse on your behalf.
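As a rough illustration of pointing a program at a proxy, here is a sketch using Python's standard `urllib.request` module. The proxy address is a placeholder, not a real service, and the helper name `build_proxied_opener` is made up for this example.

```python
import urllib.request

# Hypothetical proxy address; substitute one you are actually allowed to use.
PROXY_URL = "http://proxy.example.net:8080"

def build_proxied_opener(proxy_url):
    """Return an opener that relays HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

opener = build_proxied_opener(PROXY_URL)

# Not executed here because the proxy above does not exist:
# opener.open("https://www.espn.com/") would be relayed through the proxy,
# so the site would see the proxy's IP address rather than yours.
```

This only redirects the traffic. Whether your provider can read it on the way to the proxy depends on the kind of proxy and the protocol in use.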
Jacob: So this virtual connection, this IP address, um, is that like an actual computer effectively?
Paul: Yeah. There is. There are computers that are called proxy servers.
Jacob: Oh, okay. So, it has its own operating system, and you’re basically like a hand in a puppet.
Paul: Yes. Very good example, yeah, or analogy. So you now, when you browse into ESPN.com, they see you coming from that IP address.
Jacob: I see.
Paul: Not from your Verizon IP address.
Jacob: I see.
Paul: Now, Verizon, all Verizon knows is that you are connecting to this proxy server.
Jacob: Okay.
Paul: Or to this IP address in Europe somewhere. So, you have traffic flowing there. It’s encrypted, so they can’t detect what’s in there. So it’s literally sort of like a stream of water that is in an armored pipe that is going between your computer and this computer in Europe. So, then, ESPN sees you coming now as this European IP address.
Jacob: Interesting.
Paul: And that can be a problem, because they may say, “Well, we don’t serve this information to Europeans.”
Jacob: Right.
Paul: Or the other way around.
Jacob: I see that on YouTube. There will be occasionally videos that I want to watch that are recommended by a friend, I go to see them, and your licensing doesn’t allow you to see it in the United States.
Paul: Right. A lot of people in Europe have been using proxy servers to get a US IP address so they can view US Netflix.
Jacob: Oh, yes. I’ve heard of that. Right. Yeah, I’ve heard of that, where they’ve got, I guess more selection with the US Netflix as opposed to their country.
Paul: Yeah. Downton Abbey was a good example. People in the US had to wait six months to watch Downton Abbey. If you bought Netflix in America and basically logged in via a proxy server in Europe, you would be able to see Downton Abbey on Netflix in Europe.
Jacob: Oh, interesting.
Paul: Or whatever, you know, PBS or whatever. Or on the BBC itself. But if you browse through the BBC website and try to watch something from America, it will say, “Sorry. You’re outside of our coverage.”
Jacob: Interesting.
Paul: So you could do that now. They’ve all started to get more savvy at this, and they actually are seeing, well wait a minute. We got a lot of traffic coming from this one IP address. That probably is a proxy, because a lot of people are trying to go through it. So, the proxy people are fighting that. But we’re talking about how to be anonymous.
Jacob: Right.
Paul: So, a proxy server can help you be anonymous, and it is one way to do it. And there’s also this thing called the onion router, or the TOR network.
Jacob: Yeah. I’ve seen you list some articles along those lines.
Paul: And basically, what it is, is it’s a web of servers that pass your information around to multiple nodes so that it is difficult to unscramble all of that. So, rather than sending it to a proxy server, I send it to a TOR node that’s near me. It sends it to another TOR node to another one to another one to another one to another one to another one to another one, and then it finally goes out to the destination.
Jacob: Oh, interesting. So it’s not like… It’s not breaking up one point of information into six points and then to your destination. It’s just rerouting it six or seven times before it gets there.
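The rerouting idea can be modeled with a short Python sketch that records which neighbors each party talks to directly. This is a toy model of the hop-by-hop path only. It leaves out the layered encryption entirely and is not how the real network is implemented. The relay names and the function name `relay_views` are invented for the example.

```python
def relay_views(client, relays, destination):
    """Map each party on the path to the (previous, next) hop it deals with."""
    path = [client, *relays, destination]
    views = {}
    for i, node in enumerate(path):
        previous_hop = path[i - 1] if i > 0 else None
        next_hop = path[i + 1] if i < len(path) - 1 else None
        views[node] = (previous_hop, next_hop)
    return views

views = relay_views("your computer", ["relay A", "relay B", "relay C"], "espn.com")

for node, (previous_hop, next_hop) in views.items():
    print(f"{node}: receives from {previous_hop}, sends to {next_hop}")
```

In this model the destination only ever deals with the last relay, and the first relay is the only one that deals with your computer. The whole message travels along one path, as Jacob says, rather than being split into pieces.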
Fingerprinting your profile
Paul: Yes. Exactly. Yeah, like onions, it’s different layers, peeling back an onion. And so, that can be helpful. The problem is that there’s things that go on… So, for example, here’s the problem. If you want ESPN to… You don’t want ESPN to know who you are, so you go and get a proxy server. And you use the proxy server and you go to ESPN, and you do that religiously.
One day, you forget to turn on your proxy server, and you go to ESPN. Well, if ESPN is smart, and they could be. I don’t know if they care. But let’s say if we were really smart and clever here, they’re not just tracking you by your IP address. They’re fingerprinting you. And fingerprinting goes down to the level of what browser you’re using, what fonts you have installed, what version of your operating system, on and on and on. And all of those attributes give you a fingerprint.
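One simple way to picture a fingerprint is as a hash over a set of browser attributes. The sketch below is an illustration in Python, not a description of what ESPN or any real tracker does. The attribute names and values are invented, and real fingerprinting draws on many more signals.

```python
import hashlib
import json

def fingerprint(attributes):
    """Hash a dictionary of browser attributes into one stable identifier."""
    # Sorting the keys makes the result independent of dictionary ordering.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attributes a site might be able to observe.
browser = {
    "user_agent": "ExampleBrowser/1.0",
    "os_version": "ExampleOS 10.2",
    "fonts": ["Arial", "Helvetica", "Times New Roman"],
    "screen": "1920x1080",
}

via_proxy = fingerprint(browser)  # visit made through the proxy
direct = fingerprint(browser)     # visit made without it

print(via_proxy == direct)
```

The IP address is not an input, so the same browser produces the same value whether or not the proxy is switched on.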
So now, I can take that fingerprint, and I can see wherever it comes in. I see it coming in off of this, this IP address in Europe. Okay. That’s our friend’s, whatever, we’ll call him George, just as an anonymous name. We don’t know who it is.
Now all of a sudden, I get a hit from you coming in with that same fingerprint, same browser, everything, uh, from a Verizon account in Massachusetts.
Jacob: Interesting. Yeah. Yeah.
Paul: Okay. So, now. Alright, that’s no big deal. So, they’ve got one hit from Massachusetts and 100 hits from Europe.
Jacob: Right.
Paul: But now if some of those sites start to share information and Amazon, let’s say, they have that same fingerprint. And Google has that same fingerprint. And ABC.com has that same fingerprint. They can now start to say, “Hey, do you have anything for this fingerprint?” And they can correlate that.
Jacob: Interesting.
Paul: And know that gee, Google says, “Yeah, I know it’s Jacob, because he’s logged in.”
Jacob: Right.
Paul: Or it’s somebody using Jacob’s computer.
Jacob: Right.
Paul: And he’s logged in, so I know who Jacob is, or Amazon is even better, because you’ve given a credit card and you’ve had years of activity with them, and they know you from this IP address and this fingerprint. So, uh, you have all that information, and we can correlate it and say, “Well, he’s trying to be anonymous here, but because of his fingerprint, ESPN knows who it is.”
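The correlation step can be sketched as pooling visit records from several sites and carrying a known account name across every visit that shares a fingerprint. This is a hypothetical Python illustration. The records, the IP addresses, and the function names are made up, and nothing here claims that these sites actually share data this way.

```python
from collections import defaultdict

# Hypothetical visit records pooled from several sites.
visits = [
    {"site": "espn.com", "fingerprint": "fp-1", "ip": "198.51.100.20", "account": None},
    {"site": "espn.com", "fingerprint": "fp-1", "ip": "203.0.113.7", "account": None},
    {"site": "google.com", "fingerprint": "fp-1", "ip": "203.0.113.7", "account": "Jacob"},
    {"site": "espn.com", "fingerprint": "fp-2", "ip": "192.0.2.44", "account": None},
]

def known_accounts(visits):
    """Collect the account names seen with each fingerprint on any site."""
    accounts = defaultdict(set)
    for visit in visits:
        if visit["account"] is not None:
            accounts[visit["fingerprint"]].add(visit["account"])
    return accounts

def label_visits(visits):
    """Tag every visit with the accounts linked to its fingerprint."""
    accounts = known_accounts(visits)
    return [
        (visit["site"], visit["ip"], sorted(accounts.get(visit["fingerprint"], [])))
        for visit in visits
    ]

for site, ip, names in label_visits(visits):
    print(site, ip, names)
```

Both ESPN visits with fingerprint `fp-1` end up labeled with Jacob's name, including the one made through the proxy address, because a single logged-in visit elsewhere ties the fingerprint to an account.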
Jacob: Right.
Paul: Now you might not think… I mean, it’s hard to get anybody to work together, so it’s probably unlikely that Amazon and ESPN are sharing that type of information.
Jacob: That was my question as to whether there’s actually proof that they’re doing that. I mean, I understand it’s theoretically true.
Paul: Yeah, but it’s not a good business practice for them to do it, okay? So that’s the case. But now we have this thing called ads. And we have web bugs. And there are many of those. There’s one thing called Google Analytics, which gives Google all of your browsing habits for every website you browse that has Google Analytics on it.
Jacob: Right.
Paul: So, they sell that as a benefit to the, to the webmaster, to the person hosting the website, to say, “Hey, you can get who viewed your page, how long it was, how long they were there, etc.”
So, you can get all of this information, but you’re effectively giving Google the demographics of the people and when they use your site and why they use it.
Jacob: Right.
Paul: Now, that’s Google. There’s a company called AddThis, which is, uh, a social sharing and bookmarking applet that you can put in your webpage. So, you can add that so somebody can like it on Twitter or Facebook and tweet it and do all of those things with very little work. Well, putting that bug on there gives them a flow of information. And, I knew the CTO of AddThis and I was talking to him. This was probably five or six years ago, and I said, “Are you actually profiling the pages that people are liking?” And at that point, they weren’t. But I know that they’re starting to do that. So, they do it using natural language processing to see that you browsed – you didn’t have to tweet it. You didn’t have to do anything. You just browse, because their bug was being loaded on the page, and you browsed a page on boats.
Jacob: Right.
Paul: So, now all of a sudden, they can take that and say, “Gee, Paul is browsing boat sites.” By the way, I don’t like boats. I have no use for them, being out in the ocean without…
Jacob: We’re clarifying this because you’ve used this before.
Paul: I do. Yeah. There’s no shade on the ocean. It is not a good thing.
Paul: So, they can then sell that back to boat manufacturers and say, “Hey, Paul’s interested in boats. Do you want to sell him boat ads? Do you want to show him boat ads?”
Now they don’t do it and say, “Paul,” but they say, “Let’s sell it to Amazon.” And, I’ve used the stereo example. You were browsing stereos on one site, you’ve used these plugins. You might have gone to a personal blog of somebody that talks about the best stereo for home theater for 2015 and ’16, and you read that. AddThis is on the site. They know that you read an article about home theater stereos.
Magically, ads will start showing up from Amazon or Crutchfield saying, “Hey, you’re interested in home theater stereo.” Is that anonymous? Well, it’s anonymous in the way they’re using it, and they would claim that, that they are using anonymized data. We don’t know who it is, that it’s Jacob or Paul browsing, we just know that this person’s fingerprint wants to see boat information.
Jacob: Sure.
Paul: But you can see that it’s one click of the dial to say that that’s Jacob.
Jacob: Right.
Paul: And so, if it were illegal to think about boats, uh, you could see how a totalitarian regime could use that and say, “You’re…we’re going to find you and hunt you down and change your behavior.”
Jacob: Sure. Now, forgive my ignorance on this. Let’s say, for example in a company – this was a company of 50 people or something like that – there was somebody who was using the company computer to do, you know, nefarious things or whatever. Would they be able to be subpoenaed in such a way as to find out that it was that specific person, or would the whole company be implicated by that person’s activity?
Paul: Sure. Well, the company would be implicated to the extent that they are providing the service, and they are responsible for the use of that service. Now, whether they have or do not have tracking internally, would be difficult to know. In other words, do I know what web pages you browsed before this. Well, technology could have been implemented to track that. I don’t have it in this company. So, it could be, the government could come back and say, well, I’m culpable as the company, because I should have been controlling the access to the internet if you did something illegal.
The technology, though, certainly exists to track that. Now, having said that, let’s say I didn’t and, you know, Google has great fingerprinting technology, great, comprehensive, you know, whatever you want to say, insidious, you know. The technology is amazing. Whether it’s good or not is another question. But they could come in and say, “Well, let’s have all of your 50 machines try and, uh, browse to this test page.” They could identify which machine did it.
Now, I don’t know who’s sitting at that machine when they did it, unless… You could be at your machine, you’re logged into Google, so you’re on Google. You go out to lunch. I sit down at your machine. You know, so that’s plausible.
Jacob: Right.
Secure wi-fi as an identifier
Paul: The same thing applies with wi-fi. If you keep your wi-fi secure, that presents a threshold to somebody using it. But let’s say they break in and use it. So, they go off and they browse some illegal sites. And the government finds out, because your internet service provider reports that you have browsed illegal sites. And you come back and you say, “Well, I didn’t do it.” So, alright. Well, who else uses the computer in the household?
So, we talk to your family, and somebody in this household did it. Well, it’s hard to then say, “Well, we have an encrypted wi-fi. And we have a password. But I’ve given it to my neighbor. So, he uses it. Or she uses it. Hmmm. It could have been them.”
If you have open wi-fi, well, it could be anybody that used it. So, it’s a very interesting thing, you know, where you would think, “Oh, I want to secure my network.”
Well, the minute you secure your network, it becomes exclusive. And it’s only going to point back to you as the one who did something. But, you could easily say, no it was my three-year-old boy. You know, throw him right under the bus. He didn’t know better.
Jacob: Yeah, sorry buddy.
Paul: And that might not be believable, or he’s not really up at three o’clock in the morning.
Jacob: He does like looking at grenade launchers.
Paul: That’s right. Exactly. He does. And most little boys do, you know, so…
So, I have pondered, “Okay, how do you become anonymous?”
Jacob: Sure.
Paul: Alright. One of the things that has been a prerequisite is to have a phone nowadays. You have to have a phone in order to receive a text message or a phone call, which is proof of who you are, a validation of who you are.
Also published on Medium.