I didn’t even think of this but it does raise an interesting point about commercial real estate. Voice-driven computing is impossible in an open-concept environment unless everyone is wearing mics (and probably even then).
“Voice as first-class interface” also implies “the device is always listening.” I *strongly* doubt that any device is trustworthy enough for that. It’s one of those “sounds futuristic and cool” ideas that you realize should remain “science fiction” for the foreseeable future once you think about it.
Not really, tbh. “Always listening” means different things. Being able to listen for a hot word with an ML kernel that fits on a gumstick, in order to wake up and parse fully, is different from listening and parsing for meaning at all times. Or just do what I do and hold down cmd+option.
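To make that distinction concrete, here is a rough sketch of hot-word gating, assuming audio arrives as a stream of frames and that some small on-device detector exists. The names `gated_frames`, `is_hot_word`, `pre_roll`, and `session_len` are made up for illustration, and the "frames" are plain strings standing in for audio chunks. It is not how any particular assistant is built.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List


def gated_frames(
    frames: Iterable[str],
    is_hot_word: Callable[[List[str]], bool],
    pre_roll: int = 2,
    session_len: int = 3,
) -> Iterator[str]:
    """Pass frames downstream only after the hot word is detected.

    Before detection, frames live in a tiny rolling window that the
    (hypothetical) detector inspects; older frames fall off and are
    never forwarded to the full parser.
    """
    recent: deque = deque(maxlen=pre_roll)  # small window for the detector only
    remaining = 0  # frames left in the current "awake" session

    for frame in frames:
        if remaining > 0:
            # Awake: forward this frame for full parsing.
            yield frame
            remaining -= 1
            continue

        # Asleep: only the detector sees the rolling window.
        recent.append(frame)
        if is_hot_word(list(recent)):
            remaining = session_len
            recent.clear()


# Toy detector: "hears" the hot word when the newest frame is "hey".
def toy_detector(window: List[str]) -> bool:
    return bool(window) and window[-1] == "hey"


heard = list(gated_frames(
    ["a", "b", "hey", "go", "to", "work", "c", "d"], toy_detector
))
print(heard)
```

Everything outside the awake session is dropped once it leaves the rolling window. Whether a real device actually behaves this way is exactly the trust question being argued here.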
The pitch in the press releases a few weeks back was very much "a voice version of the Windows Recall feature everyone got mad about earlier this year", so this one boils down to "there's a reasonable way to do this and then there's what the marketers are saying".
I’m okay with “hold down a key combo” or some other form of positive confirmation that you are speaking to a device, though I prefer a direct hardware control. The problem with hot words is that the device is always listening for it, so you have to trust that the device isn’t recording everything.
This is a big problem in places where you need security guarantees that conversations aren’t being recorded, such as places that need to be HIPAA-compliant or a lawyer’s office (and lawyers using AI to write briefs doesn’t exactly fill me with confidence that they are aware of the dangers).
Oh, yeah, definitely agreed. Academic research is also full of these - any Zoom call could be an interview or a research chat with confidentiality concerns, etc. This very much sounds like a tech someone pitched based on watching too much Star Trek without thinking about how it looks in context.
Voice interaction definitely has use cases, but it has probably just as many "uh, you should not do that here" cases.
Ehh. You have to trust that your device isn’t exfiltrating everything already. Like, unless you are a burn-it-down Linux user with a hardware cutoff for your AV, you’re kinda just arguing over degree. (I have hardware mutes for my actual mics, but there are also array mics on the laptop, so, eh)
(I used to run the mobile team for a medical transcription startup, so I have spent time thinking about this, and the threat model is intractable for *owning a phone* when you go up the ladder)
I’ve been in secure environments that required leaving all electronics outside the door. So yes - my exact problem is how much you can really trust *any* device. I have a few Echo devices, but I also have the privilege of knowing how the devices were designed and how the backend works.
I think we're going to rapidly come to a solution akin to what webcams came to. They now have a hardware shutter, and I think we're going to get mics with hardware physical cut offs. That's what the people will start wanting, and the hardware companies have no incentive not to supply it.
There's some Linux-first platforms that already have these, see: puri.sm/learn/hardwa... (I have not used any of their products and do not know of their quality)
On Mac, I used Micro Snitch so I could at least see when a recording device was in use in the absence of true hardware controls. It really bothers me that the industry-wide rush to remove buttons and knobs resulted in worse security. I’m glad at least some platforms are figuring that out.
It does make me a little paranoid that their website is super gung-ho on libertarian American terms though lol
Yeah, well - if there’s something I would want from a libertarian, it’s a focus on privacy and security. After all, I don’t have to agree with their political principles to use their software. 🤪
Pretty much, but if it’s buying hardware, the question of where the money goes does come up
I can see the software companies trying to create incentives, have some sort of requirement that the switch is software, but I don't think that would go over very well.
Sure, but Microsoft is bad at it. Siri and Alexa have done this the correct way for years. Whisper models make it a lot easier to do on your own computer. (I use Hex probably 30-40 times a day to transcribe something because I can talk twice as fast as I can type, but I also talk the way I write.)
Recording locally to a dump buffer shouldn’t bother most people (except maybe a lawyer); if that’s compromised, several other more important things are toast. But the all-record stuff is self-defeating and dumb out of Microsoft, so pretty on-brand.
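For anyone wondering what a local dump buffer might look like, a minimal sketch, assuming it means a fixed-size rolling buffer that only ever holds the last few frames in memory. `DumpBuffer` and its methods are hypothetical names for illustration, not anything from Hex, Whisper, or an OS API.

```python
from collections import deque
from typing import Any, List


class DumpBuffer:
    """Fixed-size rolling buffer: old frames are overwritten, never stored."""

    def __init__(self, max_frames: int) -> None:
        # deque with maxlen silently discards the oldest item when full.
        self._frames: deque = deque(maxlen=max_frames)

    def push(self, frame: Any) -> None:
        self._frames.append(frame)

    def dump(self) -> List[Any]:
        """Hand over what is currently held, then wipe the buffer."""
        out = list(self._frames)
        self._frames.clear()
        return out


buf = DumpBuffer(max_frames=3)
for i in range(5):
    buf.push(i)
latest = buf.dump()
print(latest)
```

The point of the design is that nothing older than `max_frames` ever exists to be leaked, which is very different from recording everything and indexing it later.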
I am also an outlier in that even though I speak very quickly when left to my own devices I can still type even faster than I talk, so, this stuff is all "no, no thank you, never" for me lol.
I mean, I type around 100 wpm, but I also want my wrists to work in twenty years
I hover between 110-130 depending on the test and whether I slept properly that day. 10 years ago I could hit 150 but also I was 24 lol.
(I used to type faster, but it stopped being useful)
Yeah, I don’t have a problem with voice recognition or even LLMs as concepts. I just think they should be implemented and used responsibly, not as the latest cool gadget that’s supposed to completely replace human effort.
When I was still at Bungie they did a huge office remodel and decided nobody would have fixed desks, just made the office a giant bullpen. I said this is good because humans famously love this, and also you’re having fucking Legal and HR sit out in public? We have confidentiality obligations.
I went through this as a public-facing incident responder at MSFT. I was working on security incidents that, if disclosed, would have impacted the customer's stock price and, while there were offices available, I was prohibited from having one because I didn't have direct reports.
I'm facepalming so hard it's detectable on the Richter scale. (Monolith had a mostly open-office plan and when WBD tried to force us back in office, most of the producers were all sitting next to each other... that really was a special hell trying to do Zoom meetings for the rest of us.)
Because Bungie was run by people who aren’t smart, the answer to this was “okay, we’ll declare a portion of the office to be the Legal, HR, and Finance (sure, I guess?) neighborhood”.
This is aggravatingly dumb
I'm being dragged back to the office next week and this is the *exact* system they've set up there, right down to calling them neighbourhoods. But the neighbourhoods aren't enforced in the booking software so I'm tempted to go and sit with the board of directors for a bit.
I have no idea how this worked out, I only ever went to the office twice while I was still there, because I could guarantee my own home was secure but not a fucking “neighborhood”.
There's the rub. People pushing for voice controlled computing probably only work from home, in hotel rooms or in the corner office
I feel like we can safely assume "poorly"
As most business fads do, not that the failure ever teaches any lessons.
I'm not sure we can call open offices a "fad" after nearly a century of bullpens and open plan offices existing
Oh, it’s a fad - a serial one that companies try every so often because nobody seems to remember that it almost never works out. They do it because they see a “successful startup” do it without realizing that it’s not what made the success. It’s basically business cargo culture.
I'm not sure we can call it a fad when it dominates most industries and is the terminal Entry Level Workspace solution for new hires and juniors, though. It doesn't go away.
When I say “serial fad,” I mean that it’s something that gets tried because an executive heard (or experienced) how it helped some other company but then fails and gets rolled back until the next executive tries it. This is opposed to something like stack ranking, which is just endemic bad practice.
It’s sort of like how the size of QA orgs is inversely correlated with the business cycle. A common serial fad is downsizing QA to “minimize costs,” which lasts until quality issues force an executive or board member out, and then the company builds QA back up again and the cycle starts anew.
Probably installed a bunch of meeting rooms and phone booths and they were at ~100% occupancy all the time, making impromptu confidentiality impossible, etc.
We made a conference table out of counterfeit merch using my team’s share of the “office morale” budget and HR sat at that table when they fired my team. Or so I’m told. I didn’t get the courtesy of a call on the day they got rid of me. I found out when there was a meeting at 1pm I couldn’t access.
Well that was classy. Jesus
If ever anyone wonders why I seem angry at Bungie and/or Pete Parsons, there’s a big part of why.
If it’s not clear from context: the meeting was to tell people “hey, we did layoffs, a bunch of your colleagues are gone, please feel sorry for us”. Yes, it’s hard to be the person who ends someone’s job. I’ve had to do it in performance-based terms. It sucks. Sucks worse for the other guy though.
The expression of sympathy for this being hard would ordinarily go from me to the HR person who did the term. The HR person should NEVER ask for sympathy from the termed person’s ex-colleagues. That goes 100x for the CEO.
Fortunately he was able to console himself by buying a car that afternoon.
I worked at a place where a new CTO tried to do this to us. The proposed solution to the problem of dev kits was "equipment carts." Another exec who understood the business stepped in on our behalf to stop this nonsense, thankfully.
I later worked at another place that did this to itself. I was fully remote and still it was incredibly distracting to be in a *zoom call* with people in that environment.
As someone who works in an open concept office: if two people have a Zoom call at once, no one else can hear themselves, sooo
We had zoom meetings with our team all in our cubes (or WFH) and even with 6' partitions, we were sometimes getting mic feedback.
It's extra hilarious knowing that a year before Covid Microsoft bulldozed like a third of its campus of old single-person office buildings to rebuild with open plan buildings that everyone hates. But maybe with all the layoffs they're sitting 3 people to a floor or something
Is this like I have to yell at my car: “Hey. …Hey!…HEY!!! YOU EFFIN MACHINE, wake up!” before I can tell it the doctor’s name where I want to go and it gives me a list of 16 different possibilities most of them 150 miles away? And then I say, No…No…No…NO…(screaming) NO!!! That is so charming.