
The Shifting Scope of Assumed Privacy with LLMs

Ten years ago, a CEO standing on stage making a joke about someone’s web query would have been shocking instead of funny. That is because of the major backlash in 2006, when AOL publicly released the search query logs of roughly 650,000 users to developers, and journalists used those logs to track down individuals from the personally identifiable information they contained. The affected AOL users did not know they were in this now-public repository. But our assumptions about the scope of privacy have shifted significantly because of the era of social media that followed. The advent of micro-blogging, along with vertical tools for yelping our food, foursquaring our shopping habits, tweeting our quips or instagramming our lifestyles, brought cameras and public visibility closer to our personal sphere. We grew accustomed to privacy covering a narrower part of our daily lives. But there is still confusion around tools like browsers, where people assume their behavior is not being broadcast in real time to multiple parties.

A few years ago, a data leak at GPT exposed the conversations of its customers. OpenAI hadn’t made any assertions that conversations in its interface would be private. But some users expected that typing into a question-and-answer dialog on the service was a personal dialog in a private space. To dispel this assumption of privacy, Sam Altman, one of the founders of OpenAI and its current CEO, took care in his subsequent product release teasers to encourage people to log out of GPT if they planned to type anything of a personal or confidential nature. After that series of disclosures, we can no longer assume that an HTTPS-protected query in an open window on the web keeps our words from being blasted out to thousands of servers like a modern-day reprise of The Truman Show. The company has since rolled out paid accounts that offer privacy firewalls and sandboxing for individuals and businesses, along with assertions that confidentiality can be protected in those paid account contexts. But advertising is about to be integrated into GPT deployments, so that text you type into the window can be paired with targeting and profile cookies that enable ad serving on the GPT service or on other sites that support the GPT advertising targeting parameters, which have not yet been announced.

If you haven’t been following the slippery slope of privacy scope degradation, please adjust where your “assumption of privacy” begins for this new context. Social network activity is obviously a public declaration, so we can’t assume privacy there. While mail and messaging platforms may not have direct access to the messages you send, they may be able to leverage targeting parameters that come from AI summaries or overviews derived from those messages. LLM interfaces like GPT, Gemini and Claude give an illusion of one-to-one discussion that many mistakenly assume is as private as an SMS message. It is not. The New York Times recently cited legal cases where defendants had disclosed details of an alleged crime, assuming that GPT was a confidant that would keep the disclosure confidential under “attorney-client privilege.” Judges had to clarify that GPT is not an attorney, even though it may quote like one. Smart speakers, smart cars and web-enabled cameras have all been subject to subpoena in recent years to disclose details from within people's homes, a space previously considered to fall under the umbrella of assumed privacy. Yet the servers those devices broadcast to do not fall under that umbrella. In this new phase of the AI digital age, these assumptions need to be adjusted. So we should take heed of the CEOs who warn us about assuming privacy when using their tools to discuss personal or confidential matters.

The exfiltration of proprietary and confidential data is a risk not just to our sense of privacy; it is an erosion of our assumptions of trust more broadly. As the AI diaspora grows, exposure to disclosure vectors should gradually shrink, because spreading AI across a broader range of devices reduces the damage any single attack or common point of vulnerability can do. Industry developers are rushing to bring privacy sandboxing, RAG databases and proprietary fine-tuned LLMs to individuals' devices and small businesses. Until those firewalls are in place and the centralized networks of vulnerability are defended, it's best to consider any interaction with an LLM to be as public as an interaction on a social network. And naturally, you should not assume that an app built atop a leading LLM adheres to the confidentiality and privacy policies of the underlying LLM vendor that powers it. So disclose carefully, as we never know how many thousands or millions of people we are talking to when we talk to what we assume is one machine.
