
You’re Magnetic Tape April 4, 2019

Posted by Peter Varhol in Algorithms, Machine Learning, Technology and Culture.

That line, from the Moody Blues ‘In the Beginning’ album (yes, album, from the early 1970s), makes us out to be less than the sum of our parts, rather than more.  So logically, writer and professional provocateur Felix Salmon asks if we can prove who we say we are.

Today in an era of high security, that question is more relevant than ever.  I have a current passport, a Real ID driver’s license, a Global Entry ID card, and even my original Social Security card, issued circa 1973 (not at birth, like they are today; I had to drive to obtain it).  Our devices include biometrics like fingerprints and facial recognition, and retina scans aren’t too far behind.

On the other hand, I have an acquaintance (well, at least one) that I’ve never met.  I was messaging her the other evening when I noted, “If you are really in Barcelona, it’s 2AM (thank you, Francisco Franco), and you really should be asleep.”  She responded, “Well, I can’t prove that I’m not a bot.”

Her response raises a host of issues.  First, identity is on the cusp of becoming a big business.  If I know for certain who you are, then I can validate you for all sorts of transactions, and charge a small fee for the validation.  If you look at companies like LogMeIn, that may be their end game.

Second, as our connections become increasingly worldwide, do we really know if we are communicating with an actual human being?  With AI bots becoming increasingly sophisticated, they may be able to pass the Turing test.

Last, what will have higher value: our government-issued ID, or a private vendor ID?  I recently opined that I prefer the government, because it is far more disorganized than most private companies, but someone responded, “Government can give you an ID one day, and arbitrarily take it away the next.”  I prefer government silos and disorganization, because of the security by obscurity they provide, but is that really the best option anymore?

So, what is our ID?  And how can we positively prove we are who we say we are?  More to the point, how can we prove that we exist?  Those questions are starting to intrude on our lives, and may become central to our existence before we realize it.


Will Self-Driving Cars Ever Be Truly So? January 7, 2019

Posted by Peter Varhol in Architectures, Machine Learning, Software platforms, Technology and Culture.

The quick answer is that we will not be in self-driving cars during my lifetime.  Nor your lifetime.  Nor any combination of the two.  Despite pronouncements by so-called pundits, entrepreneurs, reporters, and GM, there is no chance of a car truly driving itself under all conditions, let alone of everyone riding in one, with all that that implies.

The fact of the matter is that the Waymo CEO has come out and said that he doesn’t imagine a scenario where self-driving cars will operate under all conditions without occasional human intervention.  Ever.  “Driverless vehicles will always have constraints,” he says.  Most of his competitors now agree.

So what do we have today?  We have some high-profile demonstrations under ideal conditions, and some high-profile announcements that say we are all going to be in self-driving cars within a few years.  And one completely preventable death.  That’s about it.  I will guess that we are about 70 percent of the way there, but that last 30 percent is going to be a real slog.

What are the problems?

  1. Mapping.  Today, self-driving cars operate only on routes that have been mapped in detail.  I’ll give you an example.  I was out running in my neighborhood one morning, and was stopped by someone looking for a specific street.  I realized that there was a barricaded fire road from my neighborhood leading to that street.  His GPS showed it as a through street, which was wrong (he preferred to believe his GPS rather than me).  If GPS and mapping cannot get every single street right, self-driving cars won’t work.  Period.
  2. Weather.  Rain or snow interferes with GPS signals, as does certain terrain.  It’s unlikely that we will ever have reliable GPS, Internet, and sensor data under extreme weather conditions, which in much of the country occur several months out of the year.
  3. Internet.  Self-driving cars on a highway must necessarily communicate with one another.  This map (paywall) pretty much explains it all.  There are large swaths of America, especially in rural areas, that lack a reliable Internet connection.
  4. AI.  Self-driving cars rely on AI to identify objects in the road.  This technology has the most potential to improve over time.  Except in bad weather.  And on poorly mapped streets.

So right now we have impressive demonstrations that bear little resemblance to real-world driving.  I won’t discount the progress that has been made.  But we should be under no illusions that self-driving cars are right around the corner.

The good news is that we will likely see specific applications in practice in a shorter period of time.  Long-haul trucking is one area that has great potential for the shorter term.  It will involve re-architecting our trucking system to create terminals around the Interstate highway system, but that seems doable, and it would be a nice application of this technology.

Getting to an Era of Self-Driving Cars Will Be Messy November 30, 2018

Posted by Peter Varhol in Machine Learning, Technology and Culture.

In the 1970s, science fiction writer Larry Niven created a near future world where instantaneous matter transport had been invented.  People would use a “phone booth” to dial in their desired destination, and automatically appear at a vacant phone booth nearest that destination.  Cargo used specially designed phone booths to transfer large or hazardous loads.

Of course, the changes in momentum attendant upon changing positions on the planet slowed the Earth’s rotation, as do jet aircraft today, and that momentum had to be dumped somewhere.  Niven used this invention as a way of exploring social phenomena, such as flash crowds (today we call them flash mobs) and ingenious ways of committing crimes.

Michael Crichton used both space and time travel in his novel Timeline (the movie was quite good too).  His technology actually copied the body at the cellular level, destroyed it at the source, then recreated it from the copy at the desired time and place.  Crichton described it by analogy, saying that it was similar to sending a fax.

The problem with this was that replication was, well, slightly less than perfect.  Cells became misaligned, which meant that cell structure was slightly off.  If you used Timeline’s time and space traveling gadget more than about half a dozen times, your body was misaligned enough so that you went crazy and/or died.

Today, we see self-driving cars as a panacea for much that ails society.  Self-driving cars, we are told, will be extremely safe, and they can be coordinated en masse to relieve traffic congestion.  They will obviously be electric, not spewing combustion gases into the atmosphere.  What could go wrong?

But none of this is remotely true, at least today and for the foreseeable future.  Although driverless cars claim an enviable safety record for miles driven, all of those miles have been on carefully mapped streets under ideal conditions.  The fact of the matter is that GPS, even with trilateration, does not give these vehicles the accuracy needed to actually navigate through traffic.

Coordinated en masse?  Just what does that mean?  Even if we had cars communicating with each other on the highway today, it would be 40 years before every car could do so.  And even if they were all communicating, can we trust our communications systems enough to coordinate thousands of cars on a highway, feet from each other?  I can’t wait to try that one.

Electric cars.  Yes, the industry is moving that way.  I just bought a new combustion-engine car; my last one was still going strong at 19 years.  Will the government force me to buy an electric car in under 20 years?  I don’t think so.

Still, this is the end game, but the end game is a lot farther out than you think.  I’m going to say a hundred years, certainly after all of us have left the mortal plane.  Car companies are saying they will be fully electric in three years.  Um, no.  Electric car advocates are even more deluded.  Car companies are saying all cars will be autonomous by 2025.  Um, no again.  These pronouncements are stupid PR statements, not worth the bytes they take up.

Yet we lap it up.  I really don’t understand that.

My Boss is a Computer August 11, 2018

Posted by Peter Varhol in Machine Learning, Technology and Culture.

Well, not really, but if you can be fired by a computer, it must be your boss.  This is not my story, but it is one that foretells the future nonetheless.  An apparently uncorrectable software defect led to a contract employee being locked out of his computer and his building, and labeled inactive in the payroll system.

It was almost comical that his manager and other senior managers and executives at the company, none of whom had fired him, could not get this fiat reversed.  A full three weeks passed, in which he received no pay and no explanation, before they were able to determine that his employment status had never been updated in their new HR management software.  Even after he was reinstated, his colleagues treated him as someone not entitled to work there, and he eventually left.

It seems that intelligent (or otherwise) software is encroaching on the ultimate and unabashedly people-oriented field – human resources.  And there’s not a darned thing we can do about it.  Software is not only conducting full interviews, but also performing the entire hiring process.  While we might hope that we aren’t actually selected (or rejected) by computer algorithms, that is the goal of these software systems.

So here’s the problem.  Or several problems.  First, software isn’t perfect, and while most bugs in released software are no more than annoying, bugs in this kind of software can have drastic consequences for people.  Those consequences will likely spill over to the hiring company itself.

Second, these applications are usually machine learning systems whose algorithms have been trained on large amounts of data.  The most immediate problem is that biased training data will simply perpetuate existing practices.  That’s a problem because everything about the interview and selection process is subjective and highly prone to bias.

Last, if the software doesn’t allow for human oversight and the ability to override, then in effect a company has ceded its hiring decisions to software that it most likely doesn’t understand.  That’s a recipe for disaster, as management has given up control over the very decisions it exists to make.

Now, there may be some who will say that’s actually a good thing.  Human management is, well, human, with human failings, and sometimes those failings manifest themselves in negative ways.  Bosses are dictatorial, or racist, or some combination of negative qualities, and are often capricious in dealing with others.  Computer software is at least consistent, if not necessarily fair as we might define it.

But no matter how poor the decisions that come from human managers, we own them.  If they come from software, no one does.  When we are locked into following the dictates of software, without any understanding of who programmed it to do what, then we give up on our fellow citizens and colleagues.  Worse, we give up the control that we are paid to maintain.

If we are not to face a dystopian future where computer software rules our working lives and we are powerless to act as the humans we are, then we must control the software that is presumably helping us.

Here’s Looking At You June 18, 2018

Posted by Peter Varhol in Algorithms, Machine Learning, Software tools, Technology and Culture.

I studied a rudimentary form of image recognition when I was a grad student.  While I could (sometimes) identify simple images based on obviously distinguishing characteristics, between the limitations of rule-based systems and the limited computing power of Lisp machines and early Macs, facial recognition was well beyond the capabilities of the day.

Today, facial recognition has benefited greatly from better algorithms and faster processing, and is available commercially from several different companies.  There is some question as to its reliability, but at this point it’s probably better than any manual approach to comparing photos.  And that seems to be a problem for some.

Recently the ACLU and nearly 70 other groups sent a letter to Amazon CEO Jeff Bezos, alongside one from 20 shareholder groups, arguing that Amazon should not provide surveillance systems such as facial recognition technology to the government.  Amazon has a facial recognition system called Rekognition (why would you use a spelling more reminiscent of evil times in our history?).

Once again, despite the Hitleresque product name, I don’t get the outrage.  We give the likes of Facebook our life history in detail, in pictures and video, and let them sell it on the open market, but the police can’t automate the search of photos?  That makes no sense.  Facebook continues to get our explicit approval for the crass but grossly profitable commercialization of our most intimate details, while our government cannot use commercial and legal software tools?

Make no mistake; I am troubled by our surveillance state, probably more than most people, but we cannot deny tools to our government that the Bad Guys can buy and use legally.  We may not like the result, but we seem happy to go along like sheep when it’s Facebook as the shepherd.

I tried for the life of me to curse our government for its intrusion in our lives, but we don’t seem to mind it when it’s Facebook, so I just can’t get excited about the whole thing.  I cannot imagine Zuckerberg running for President.  Why should he give up the most powerful position in the world to face the checks and balances of our government?

I am far more concerned about individuals using commercial facial recognition technology to identify and harass total strangers.  Imagine an attractive young lady (I am a heterosexual male, but it’s also applicable to other combinations) walking down the street.  I take her photo with my phone, and within seconds have her name, address, and life history (quite possibly from her Facebook account).  Were I that type of person (I hope I’m not), I could use that information to make her life difficult.  While I don’t think I would, there are people who would think nothing of doing so.

So my take is that if you don’t want the government to use commercial facial recognition software, demonstrate your honesty and integrity by getting the heck off of Facebook first.

Update:  Apple will automatically share your location when you call 911.  I think I’m okay with this, too.  When you call 911 for an emergency, presumably you want to be found.

Cognitive Bias in Machine Learning June 8, 2018

Posted by Peter Varhol in Algorithms, Machine Learning.

I’ve danced around this topic over the last eight months or so, and now think I’ve learned enough to say something definitive.

So here is the problem.  Neural networks are sets of layered algorithms.  A network might have three layers, or it might have over a hundred.  These algorithms, which can be as simple as polynomials or as complex as partial derivatives, process incoming data and pass it up to the next layer for further processing.
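
To make that concrete, here is a minimal sketch in Python.  The layer sizes, the tanh nonlinearity, and the random weights are all invented for illustration, not taken from any particular framework:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, bias):
    """One layer: a weighted sum of the inputs followed by a simple nonlinearity."""
    return np.tanh(x @ weights + bias)

# Three layers: 4 inputs -> 8 hidden -> 8 hidden -> 1 output.
# In a real network the weights are learned; here they are random placeholders.
shapes = [(4, 8), (8, 8), (8, 1)]
params = [(rng.normal(size=s), np.zeros(s[1])) for s in shapes]

x = rng.normal(size=(1, 4))   # one observation with 4 independent variables
for w, b in params:
    x = layer(x, w, b)        # each layer passes its output up to the next

print(x)                      # the network's output for this observation
```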

Where do these layers of algorithms come from?  Well, that’s a much longer story.  For the time being, let’s just say they are the secret sauce of the data scientists.

The entire goal is to produce an output that accurately models the real-life outcome.  So we run our independent variables through the layers of algorithms and compare the output to the reality.

There is a problem with this.  Given a complex enough neural network, it is entirely possible to train it to produce an acceptable output on almost any data set, even one unrelated to the problem domain.

And that’s the problem.  If almost any data set will work for training, then choosing a truly representative data set can be a real challenge.  Of course, we would never use a random data set for training; we would use something related to the problem domain.  And here is where the potential for bias creeps in.
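
Here is a sketch of that claim, using scikit-learn purely for convenience.  The data set is invented noise with randomly assigned labels, so by construction there is nothing real to learn:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# 200 observations of pure noise, with labels assigned at random.
X = rng.normal(size=(200, 20))
y = rng.integers(0, 2, size=200)

# A network with enough capacity will still fit this data.
model = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=5000)
model.fit(X, y)

print(model.score(X, y))  # training accuracy near 1.0: memorization, not insight
```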

Bias is disproportionate weight in favor of or against one thing, person, or group compared with another.  It’s when we make one choice over another for emotional rather than logical reasons.  Of course, computers can’t show emotion, but they can reflect the biases of their data, and the biases of their designers.  So we have data scientists either working with data sets that don’t completely represent the problem domain, or making incorrect assumptions about the relationships between data and results.

In fact, depending on the data, the bias can be drastic.  MIT researchers recently demonstrated Norman, the psychopathic AI.  Norman was trained on written captions describing graphic images of death, drawn from the darkest corners of Reddit.  Norman now sees only violent imagery in Rorschach inkblot cards.  And of course there was Tay, the artificial intelligence chatbot originally released by Microsoft on Twitter.  In less than a day, Twitter users discovered that Tay could be trained with tweets, and trained it to be obnoxious and racist.

So the data we use to train our neural networks can make a big difference in the results.  We might pick out terrorists based on their appearance or religious affiliation, rather than any behavior or criminal record.  Or we might deny loans to people based on where they live, rather than their ability to pay.

On the one hand, biases may make machine learning systems seem more, well, human.  On the other, we want outcomes from our machine learning systems that accurately reflect the problem domain, not our prejudices.  We don’t want our human biases to be inherited by our computers.

Can Machines Learn Cause and Effect? June 6, 2018

Posted by Peter Varhol in Algorithms, Machine Learning.

Judea Pearl is one of the giants of what started as an offshoot of classical statistics but has evolved into the field of machine learning.  His actual contributions deal with Bayesian statistics, along with prior and conditional probabilities.

If it sounds like a mouthful, it is.  Bayes’ Theorem and its accompanying statistical models are at the same time surprisingly intuitive and mind-blowingly abstruse (at least to me, of course).  Bayes’ Theorem describes the probability of a particular outcome, based on prior knowledge of conditions that might be related to the outcome.  Further, we update that probability when we have new information, so it is dynamic.
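
A minimal sketch of that updating, with invented numbers: start with a prior probability, fold in each new piece of evidence, and the posterior becomes the prior for the next round.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Posterior P(hypothesis | evidence) via Bayes' Theorem."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / evidence

# Hypothetical example: a condition with a 1% prior, and a test that is
# 95% sensitive but has a 10% false-positive rate.
belief = 0.01
for _ in range(3):  # three positive test results in a row
    belief = bayes_update(belief, 0.95, 0.10)
    print(round(belief, 4))  # the probability rises with each new result
```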

So when Judea Pearl talks, I listen carefully.  In this interview, he points out that machine learning and AI as practiced today are limited by the techniques we are using.  In particular, he claims that neural networks simply “do curve fitting,” rather than understand relationships.  His goal is for machines to discern cause and effect between variables, that is, “A causes B to happen, B causes C to happen, but C does not cause A or B.”  He thinks that Bayesian inference is ultimately a way to do this.

It’s a provocative statement to say that we can teach machines about cause and effect.  Cause and effect is a very situational concept.  Even most humans stumble over it.  For example, does more education cause people to have a higher income?  Well, maybe.  Or it may be that more intelligence causes a higher income, but more intelligent people also tend to have more education.  I’m simply not sure how we would go about training a machine about cause and effect using only quantitative data.

As for neural networks being mere curve-fitting, well, okay, in a way.  He is correct to point out that what we are doing with these algorithms is not finding Truth, or cause and effect, but rather looking for the best way of expressing a relationship between our data and the outcome produced (or desired, in the case of supervised learning).

All that says is that there is a relationship between the data and the outcome.  Is it causal?  It’s entirely possible that not even a human knows.

And it’s not at all clear to me that this is what Bayesian inference says.  In fact, I don’t see anything in any statistical technique that allows us to assume cause and effect.  Right now, the closest we come in simple correlation is R-squared, which tells us how much of the variance in one variable is “explained” by its relationship to another.  But “explained” doesn’t mean what you think it means.
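
As a quick illustration with invented data: two variables driven by a common third factor can show a high R-squared without either causing the other.

```python
import numpy as np

rng = np.random.default_rng(1)

# A hidden common driver (say, summer temperature) pushes both
# ice cream sales and drowning counts up and down together.
driver = rng.normal(size=500)
ice_cream = 2.0 * driver + rng.normal(scale=0.5, size=500)
drownings = 1.5 * driver + rng.normal(scale=0.5, size=500)

r = np.corrcoef(ice_cream, drownings)[0, 1]
print(r ** 2)  # high R-squared, yet neither variable causes the other
```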

As for teaching machines cause and effect, I don’t discount it eventually.  Human intelligence and free will are an existence proof: we exhibit those characteristics, at least some of the time, so it is not unreasonable to think that machines might someday do so as well.  That said, it certainly won’t happen in my lifetime.

And about data.  We fool ourselves here too.  More on this in the next post.

More on AI and the Turing Test May 20, 2018

Posted by Peter Varhol in Architectures, Machine Learning, Strategy, Uncategorized.

It turns out that most people who care to comment are, to use the common phrase, creeped out at the thought of not knowing whether they are talking to an AI or a human being.  I get that, although I don’t think I’m bothered by such a notion myself.  After all, what do we know about people during a casual phone conversation?  Many of them probably sound like robots to us anyway.

And this article in the New York Times notes that Google was only able to accomplish this feat by severely limiting the domain in which the AI could interact – in this case, making dinner reservations or a hair appointment.  The demonstration was still significant, but it isn’t a truly practical application, even within a limited domain space.

Well, that’s true.  The era of an AI program interacting like a human across multiple domains is far away, even with the advances we’ve seen over the last few years.  And this is why I even doubt the viability of self-driving cars anytime soon.  The problem domains encountered by cars are enormously complex, far more so than any current tests have attempted.  From road surface to traffic situation to weather to individual preferences, today’s self-driving cars can’t deal with being in the wild.

You may retort that all of these conditions are objective and highly quantifiable, making them possible to anticipate and program for.  But we come across driving situations almost daily that contain new elements, which must be instinctively integrated into our body of knowledge and acted upon.  Computers certainly have the speed to do so, but they lack a good learning framework to identify critical data and integrate it into their neural networks to respond in real time.

Author Gary Marcus notes that what this means is that the deep learning approach to AI has failed.  I laughed when I came to the solution proposed by Dr. Marcus – that we return to the backward-chaining, rules-based approach of two decades ago.  This was what I learned during much of my graduate studies, and it was largely given up on in the 1990s as unworkable.  Building layer upon layer of interacting rules was tedious and error-prone, and it required an exacting understanding of just how backward chaining worked.
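
For readers who never met that style of system, here is a toy sketch; the rules, facts, and goal are invented for illustration.  To prove a goal, the engine works backward, treating the premises of any rule that concludes the goal as new subgoals:

```python
# Each rule maps a conclusion to lists of premises that would establish it.
RULES = {
    "can_drive": [["has_license", "has_car"]],
    "has_car":   [["owns_car"], ["can_borrow_car"]],
}
FACTS = {"has_license", "can_borrow_car"}

def prove(goal):
    """Backward chaining: a goal holds if it is a known fact, or if every
    premise of some rule concluding it can itself be proven."""
    if goal in FACTS:
        return True
    return any(all(prove(premise) for premise in premises)
               for premises in RULES.get(goal, []))

print(prove("can_drive"))  # True: a license plus a borrowable car
```

Every rule, and every interaction between rules, had to be written and maintained by hand, which is why the approach collapsed under its own weight.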

Ultimately, I think that the next generation of AI will incorporate both types of approaches: a neural network to process data and come to a decision, and a rules-based system to provide the learning foundation and structure.