jump to navigation

You’re Magnetic Tape April 4, 2019

Posted by Peter Varhol in Algorithms, Machine Learning, Technology and Culture.
Tags: , , , ,
add a comment

That line, from the Moody Blues ‘In the Beginning’ album (yes, album, from the early 1970s), makes us out to be less than the sum of our parts, rather than more.  So logically, writer and professional provocateur Felix Salmon asks if we can prove who we say we are.

Today in an era of high security, that question is more relevant than ever.  I have a current passport, a Real ID driver’s license, a Global Entry ID card, and even my original Social Security card, issued circa 1973 (not at birth, like they are today; I had to drive to obtain it).  Our devices include biometrics like fingerprints and facial recognition, and retina scans aren’t too far behind.

On the other hand, I have an acquaintance (well, at least one) that I’ve never met.  I was messaging her the other evening when I noted, “If you are really in Barcelona, it’s 2AM (thank you, Francisco Franco), and you really should be asleep.”  She responded, “Well, I can’t prove that I’m not a bot.”

Her response raises a host of issues.  First, identity is on the cusp of becoming a big business.  If I know for certain who you are, then I can validate you for all sorts of transactions, and charge a small fee for the validation.  If you look at companies like LogMeIn, that may their end game.

Second, as our connections become increasingly worldwide, do we really know if we are communicating with an actual human being?  With AI bots becoming increasingly sophisticated, they may be able to pass the Turing test.

Last, what will have higher value, our government-issued ID, or a private vendor ID?  I recently opined that I prefer the government, because they are far more disorganized than most private companies, but someone responded “Government can give you an ID one day, and arbitrarily take it away the next.”  I prefer government siloes and disorganization, because of security by obscurity, but is that really the best option any more?

So, what is our ID?  And how can we positively prove we are who we say we are?  More to the point, how can we prove that we exist?  Those questions are starting to intrude on our lives, and may become central to our existence before we realize it.


Should We Let Computers Control Aircraft? March 23, 2019

Posted by Peter Varhol in Algorithms, Software platforms.
1 comment so far

Up until the early 1990s, pilots controlled airliners directly, using hydraulic systems.  A hydraulic system contains a heavy fluid (hydraulic oil) in tubes whose pressure is used to physically push control surfaces in the desired direction.  In other words, the pilots directly manipulated the aircraft control surfaces.

There is some comfort in direct control, in that we are certain that our commands translate directly to control surface motion.  There have only been a few instances where aircraft have complete lost hydraulics.  The best-known one is United Flight 232, in 1989, where an exploding engine on the DC-10 punctured lines in all three hydraulic systems.  The airliner crash landed in Sioux City, Iowa, with the loss of about a third of the passengers and crew, yet was considered to be a successful operation.

A second was a DHL A300 cargo plane hit by a missile after takeoff from Baghdad Airport in 2003.  It managed to return to the airport without loss of life (there was only a crew of three on board), although it ended up off the runway.

In 1984, Airbus launched the A320, the first fly-by-wire airliner.  This craft used wires between the flight controls used by the pilot and the control surfaces, with computers sitting in the middle.  The computers accept a control request from the pilot, interpret it in light of all other flight data available, and decide if and how to carry out the request (note the term “request”).  There were a few incidents with early A320s, but it was generally successful.

Today, all airliners are fly-by-wire.  Cockpit controls request changes in control surfaces, and the computer decides if it is safe to carry them out.  The computers also make continuous adjustments to the control surfaces, enabling smooth flight without pilot intervention.  In practice, pilots (captain or first officer) only fly manually for perhaps a few minutes of every flight.  Even when they fly manually, they are using the fly-by-wire system, albeit with less computer intervention.  Oh, and if the computer determines that a request cannot be executed safely, it won’t.

Fly-by-wire is inarguably safer than direct-fly hydraulic systems in controlling an aircraft.  Pilots make mistakes, and a few of those mistakes can have serious consequences.  But fewer mistakes can be made if the computer is in charge.  Another but:  Anyone who says that no mistakes can be made by the computer is on drugs.

Fly-by-wire systems are controlled by complex software, and software has an inherent problem – it isn’t and can’t be perfect.  And while aircraft software is developed under strict safety protocols, that doesn’t prevent bugs.  In the 737 MAX MCAS software, Boeing seems to have forgotten that, and made the system difficult to override.  And it didn’t document the changes to pilot manuals.  And that, apparently, is why we are here.  I am not even clear that the MCAS software is buggy; instead, it seems like it performed as designed, but the design was crappy.

The real solution is that yes, the computer has to fly the airplane under most circumstances.  The aircrew in that case are flight managers, not pilots in the traditional sense.  But if there is an unusual situation (bad storm, computer or sensor failure, structural failure, or more), the pilots must be trained to take over and fly the plane safely.  That is where both airliner manufacturers and airlines are falling down right now.

Aircrews are forgetting, or not learning, how to fly planes.  And not learning situational awareness, when they are able to comprehend when something is going wrong, and need to intervene.  It’s not their fault; aircraft and flying has changed enormously in the last two decades, and there is a generation of younger pilots who may not be able to recognize a deteriorating situation, or what to do about it.

Here’s Looking At You June 18, 2018

Posted by Peter Varhol in Algorithms, Machine Learning, Software tools, Technology and Culture.
Tags: , , ,
add a comment

I studied a rudimentary form of image recognition when I was a grad student.  While I could (sometimes) identify simple images based on obviously distinguishing characteristics, the limitations of rule-based systems, the computing power of Lisp Machines and early Macs, facial recognition was well beyond the capabilities of the day.

Today, facial recognition has benefitted greatly from better algorithms and faster processing, and is available commercially by several different companies.  There is some question as to the reliability, but at this point it’s probably better than any manual approach to comparing photos.  And that seems to be a problem for some.

Recently the ACLU and nearly 70 groups sent a letter to Amazon CEO Jeff Bezos, alongside the one from 20 shareholder groups, arguing Amazon should not provide surveillance systems such as facial recognition technology to the government.  Amazon has a facial recognition system called Rekognition (why would you use a spelling that is more reminiscent of evil times in our history?)

Once again, despite the Hitleresque product name, I don’t get the outrage.  We give the likes of Facebook our life history in detail, in pictures and video, and let them sell it on the open market, but the police can’t automate the search of photos?  That makes no sense.  Facebook continues to get our explicit approval for the crass but grossly profitable commercialization of our most intimate details, while our government cannot use commercial and legal software tools?

Make no mistake; I am troubled by our surveillance state, probably more than most people, but we cannot deny tools to our government that the Bad Guys can buy and use legally.  We may not like the result, but we seem happy to go along like sheep when it’s Facebook as the shepherd.

I tried for the life of me to curse our government for its intrusion in our lives, but we don’t seem to mind it when it’s Facebook, so I just can’t get excited about the whole thing.  I cannot imagine Zuckerberg running for President.  Why should he give up the most powerful position in the world to face the checks and balances of our government?

I am far more concerned about individuals using commercial facial recognition technology to identify and harass total strangers.  Imagine an attractive young lady (I am a heterosexual male, but it’s also applicable to other combinations) walking down the street.  I take her photo with my phone, and within seconds have her name, address, and life history (quite possibly from her Facebook account).  Were I that type of person (I hope I’m not), I could use that information to make her life difficult.  While I don’t think I would, there are people who would think nothing of doing so.

So my take is that if you don’t want the government to use commercial facial recognition software, demonstrate your honesty and integrity by getting the heck off of Facebook first.

Update:  Apple will automatically share your location when you call 911.  I think I’m okay with this, too.  When you call 911 for an emergency, presumably you want to be found.

Cognitive Bias in Machine Learning June 8, 2018

Posted by Peter Varhol in Algorithms, Machine Learning.
Tags: , , ,
add a comment

I’ve danced around this topic over the last eight months or so, and now think I’ve learned enough to say something definitive.

So here is the problem.  Neural networks are sets of layered algorithms.  It might have three layers, or it might have over a hundred.  These algorithms, which can be as simple as polynomials, or as complex as partial derivatives, process incoming data and pass it up to the next level for further processing.

Where do these layers of algorithms come from?  Well, that’s a much longer story.  For the time being, let’s just say they are the secret sauce of the data scientists.

The entire goal is to produce an output that accurately models the real-life outcome.  So we run our independent variables through the layers of algorithms and compare the output to the reality.

There is a problem with this.  Given a complex enough neural network, it is entirely possible that any data set can be trained to provide an acceptable output, even if it’s not related to the problem domain.

And that’s the problem.  If any random data set will work for training, then choosing a truly representative data set can be a real challenge.  Of course, will would never use a random data set for training; we would use something that was related to the problem domain.  And here is where the potential for bias creeps in.

Bias is disproportionate weight in favor of or against one thing, person, or group compared with another.  It’s when we make one choice over another for emotional rather than logical reasons.  Of course, computers can’t show emotion, but they can reflect the biases of their data, and the biases of their designers.  So we have data scientists either working with data sets that don’t completely represent the problem domain, or making incorrect assumptions between relationships between data and results.

In fact, depending on the data, the bias can be drastic.  MIT researchers have recently demonstrated Norman, the psychopathic AI.  Norman was trained with written captions describing graphic images about death from the darkest corners of Reddit.  Norman sees only violent imagery in Rorschach inkblot cards.  And of course there was Tay, the artificial intelligence chatter bot that was originally released by Microsoft Corporation on Twitter.  After less than a day, Twitter users discovered that Tay could be trained with tweets, and trained it to be obnoxious and racist.

So the data we use to train our neural networks can make a big difference in the results.  We might pick out terrorists based on their appearance or religious affiliation, rather than any behavior or criminal record.  Or we might deny loans to people based on where they live, rather than their ability to pay.

On the one hand, biases may make machine learning systems seem more, well, human.  On the other, we want outcomes from our machine learning systems that accurately reflect the problem domain, and not biased.  We don’t want our human biases to become inherited by our computers.

Can Machines Learn Cause and Effect? June 6, 2018

Posted by Peter Varhol in Algorithms, Machine Learning.
Tags: , , ,
add a comment

Judea Pearl is one of the giants of what started as an offshoot of classical statistics, but has evolved into the machine learning area of study.  His actual contributions deal with Bayesian statistics, along with prior and conditional probabilities.

If it sounds like a mouthful, it is.  Bayes Theorem and its accompanying statistical models are at the same time surprisingly intuitive and mind-blowingly obtuse (at least to me, of course).  Bayes Theorem describes the probability of a particular outcome, based on prior knowledge of conditions that might be related to the outcome.  Further, we update that probability when we have new information, so it is dynamic.

So when Judea Pearl talks, I listen carefully.  In this interview, he is pointing out that machine learning and AI as practiced today is limited by the techniques we are using.  In particular, he claims that neural networks simply “do curve fitting,” rather than understand about relationships.  His goal is for machines to discern cause and effect between variables, that is “A causes B to happen, B causes C to happen, but C does not cause A or B”.  He thinks that Bayesian inference is ultimately a way to do this.

It’s a provocative statement to say that we can teach machines about cause and effect.  Cause and effect is a very situational concept.  Even most humans stumble over it.  For example, does more education cause people to have a higher income?  Well maybe.  Or it may be that more intelligence causes a higher income, but more intelligent people also tend to have more education.  I’m simply not sure about how we would go about training a machine, using only quantitative data, about cause and effect.

As for neural networks being mere curve-fitting, well, okay, in a way.  He is correct to point out that what we are doing with these algorithms is not finding Truth, or cause and effect, but rather looking at the best way of expressing a relationship between our data and the outcome produced (or desired, in the case of unsupervised learning).

All that says is that there is a relationship between the data and the outcome.  Is it causal?  It’s entirely possible that not even a human knows.

And it’s not at all clear to me that this is what Bayesian inference is saying.  And in fact I don’t see anything in any statistical technique that allows us to assume cause and effect.  Right now, the closest we come to this in simple correlation is R-squared, which allows us to say how much of a statistical correlation is “explained” by the data.  But “explained” doesn’t mean what you think it means.

As for teaching machines cause and effect, I don’t discount it eventually.  Human intelligence and free will is an existence proof; we exhibit those characteristics, at least some of the time, so it is not unreasonable to think that machines might someday also do so.  That said, it certainly won’t happen in my lifetime.

And about data.  We fool ourselves here too.  More on this in the next post.

Alexa, Phone Joe May 28, 2018

Posted by Peter Varhol in Algorithms, Software platforms, Technology and Culture.
Tags: , ,
add a comment

By now, the story of how Amazon Alexa recorded a private conversation and sent the recording off to a colleague is well-known.  Amazon has said that the event was a highly unlikely series of circumstances that will only happen very rarely.  Further, it promised to try to adjust the algorithms so that it didn’t happen again, but no guarantees, of course.

Forgive me if that doesn’t make me feel better.  Now, I’m not blaming Amazon, or Alexa, or the couple involved in the conversation.  What this scenario should be doing is radically readjusting what our expectations of a private conversation are.  About three decades ago, there was a short-lived (I believe) reality TV show called “Children Say the Funniest Things.”  It turned out that most of the funniest things concerned what they repeated from their parents.

Well, it’s not only our children that are in the room.  It’s also Internet-connected “smart” devices that can reliably digitally record our conversations and share them around the world.  Are we surprised?  We shouldn’t be.  Did we really think that putting a device that we could talk to in the room wouldn’t drastically change what privacy meant?

Well, here we are.  Alexa is not only a frictionless method of ordering products.  It is an unimpeachable witness listening to “some” conversations in the room.  Which ones?  Well, that’s not quite clear.  There are keywords, but depending on location, volume, and accent, Alexa may hear keywords where none are intended.

And it will decide who to share those conversations with, perhaps based on pre-programmed keywords.  Or perhaps based on an AI-type natural language interpretation of a statement.  Or, most concerning, based on a hack of the system.

One has to ask if in the very near future Alexa may well be subject to a warrant in a criminal case?  Guess what, it has already happened.  And unintended consequences will continue to occur, and many of those consequences will continue to be more and more public.

We may well accept that tradeoff – more and different unintended consequences in return for greater convenience in ordering things.  I’m aware that Alexa can do more than that, and that its range of capability will only continue to expand.  But so will the range of unintended consequences.

Google AI and the Turing Test May 12, 2018

Posted by Peter Varhol in Algorithms, Machine Learning, Software development, Technology and Culture, Uncategorized.
Tags: , , ,
add a comment

Alan Turing was a renowned mathematician in Britain, and during WW 2 worked at Bletchley Park in cryptography.  He was an early computer pioneer, and today is probably best known for the Turing Test, a way of distinguishing between computers and humans (hypothetical at the time).

More specifically, the Turing Test was designed to see if a computer could pass for a human being, and was based on having a conversation with the computer.  If the human could not distinguish between talking to a human and talking to a computer, the computer was said to have passed the Turing Test.  No computer has ever done so, although Joseph Weizenbaum’s Eliza psychology therapist in the 1960s was pretty clever (think Alfred Adler).

The Google AI passes the Turing Test.  https://www.youtube.com/watch?v=D5VN56jQMWM&feature=youtu.be.

I’m of two minds about this.  First, it is a great technical and scientific achievement.  This is a problem that for decades was thought to be intractable.  Syntax has definite structure and is relatively easy to parse.  While humans seem to understand language semantics instinctively, there are ambiguities that can only be learned through training.  That’s where deep learning through neural networks comes in.  And to respond in real time is a testament to today’s computing power.

Second, and we need this because we don’t want to have phone conversations?  Of course, the potential applications go far beyond calling to make a hair appointment.  For a computer to understand human speech and respond intelligently to the semantics of human words, it requires some significant training in human conversation.  That certainly implies deep learning, along with highly sophisticated algorithms.  It can apply to many different types of human interaction.

But no computing technology is without tradeoffs, and intelligent AI conversation is no exception.  I’m reminded of Sherry Turkle’s book Reclaiming Conversation.  It posits that people are increasingly afraid of having spontaneous conversations with one another, mostly because we cede control of the situation.  We prefer communications where we can script our responses ahead of time to conform to our expectations of ourselves.

Having our “AI assistant” conduct many of those conversations for us seems like simply one more step in our abdication as human beings, unwilling to face other human beings in unscripted communications.  Also, it is a way of reducing friction in our daily lives, something I have written about several times in the past.

Reducing friction is also a tradeoff.  It seems worthwhile to make day to day activities easier, but as we do, we also fail to grow as human beings.  I’m not sure where the balance lies here, but we should not strive single-mindedly to eliminate friction from our lives.

5/14 Update:  “Google Assistant making calls pretending to be human not only without disclosing that it’s a bot, but adding “ummm” and “aaah” to deceive the human on the other end with the room cheering it… horrifying. Silicon Valley is ethically lost, rudderless and has not learned a thing…As digital technologies become better at doing human things, the focus has to be on how to protect humans, how to delineate humans and machines, and how to create reliable signals of each—see 2016. This is straight up, deliberate deception. Not okay.” – Zeynep Tufekci, Professor & Writer 

Let’s Have a Frank Discussion About Complexity December 7, 2017

Posted by Peter Varhol in Algorithms, Machine Learning, Strategy, Uncategorized.
Tags: , , , ,
add a comment

And let’s start with the human memory.  “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information” is one of the most highly cited papers in psychology.  The title is rhetorical, of course; there is nothing magical about the number seven.  But the paper and associated psychological studies explicitly define the ability of the human mind to process increasingly complex information.

The short answer is that the human mind is a wonderful mechanism for some types of processing.  We can very rapidly process a large amount of sensory inputs, and draw some very quick but not terribly accurate conclusions (Kahneman’s Type 1 thinking), we can’t handle an overwhelming amount of quantitative data and expect to make any sense out of it.

In discussing machine learning systems, I often say that we as humans have too much data to reliably process ourselves.  So we set (mostly artificial) boundaries that let us ignore a large amount of data, so that we can pay attention when the data clearly signify a change in the status quo.

The point is that I don’t think there is a way for humans to deal directly with a lot of complexity.  And if we employ systems to evaluate that complexity and present it in human-understandable concepts, we are necessarily losing information in the process.

This, I think, is a corollary of Joel Spolsky’s Law of Leaky Abstractions, which says that anytime you abstract away from what is really happening with hardware and software, you lose information.  In many cases, that information is fairly trivial, but in some cases, it is critically valuable.  If we miss it, it can cause a serious problem.

While Joel was describing abstraction in a technical sense, I think that his law applies beyond that.  Any time that you add layers in order to better understand a scenario, you out of necessity lose information.  We look at the Dow Jones Industrial Average as a measure of the stock market, for example, rather than minutely examine every stock traded on the New York Stock Exchange.

That’s not a bad thing.  Abstraction makes it possible for us to better comprehend the world around us.

But it also means that we are losing information.  Most times, that’s not a disaster.  Sometimes it can lead us to disastrously bad decisions.

So what is the answer?  Well, abstract, but doubt.  And verify.