
Tragically, Once Again Self-Driving Cars Aren’t August 28, 2021

Posted by Peter Varhol in Machine Learning, Software development, Technology and Culture, travel.

Two stories crossed my attention today that once again tragically demonstrate we are likely decades away from self-driving cars, if they ever arrive at all.  The first, and stupidest, involved the largest and most arrogant auto company, Toyota, which for some inconceivable reason decided to run its autonomous buses at the Paralympics.  One hit an athlete in a legal crosswalk, injuring him and knocking him out of the Games.

Toyota’s CEO posted an apology on YouTube (without even mentioning the athlete by name, which is simply insulting) that reads not so much as an apology as a brazen PR stunt.  I know people who swear by Toyota cars; I swear at them, and this level of arrogance makes it worse.  Make it right with the athlete, which Toyota will not do, lest it damage the brand.

The second involves, of course, a Tesla, which advertises a “fully autonomous mode” that is anything but.  A driver who admits he was not paying attention, and was watching a movie instead, hit two police cars that were stopped on the side of the road with lights flashing, attending to another motorist.

Of course, despite the marketing names Tesla gives its driver-assist technology (and that’s really what it is), the company includes plenty of caveats in the fine print.  Those caveats exist to keep it out of legal trouble, even though the marketing names strongly suggest otherwise.  This was the eleventh police car with flashing lights that a Tesla has hit.  While Tesla may end up being a long-term success, it is doing itself no favors in the interim.

So what happened to all of the predictions?  This is how Anthony Foxx, former U.S. secretary of transportation, envisioned the future of autonomous vehicles in 2016:

“By 2021, we will see autonomous vehicles in operation across the country in ways that we [only] imagine today. … Families will be able to walk out of their homes and call a vehicle, and that vehicle will take them to work or to school. We’re going to see transit systems sharing services with some of these companies.”

Auto executives were no less effusive.  Elon Musk is by far the worst of the group.  I strongly believe that these so-called predictions were, and remain, criminally wrong, because they encourage people to misuse today’s technology.

I personally believe that fully autonomous vehicles are at least decades away, and possibly completely infeasible.

Why Testing Needs Explainable Artificial Intelligence April 19, 2021

Posted by Peter Varhol in Algorithms, Machine Learning, Software development.

Many artificial intelligence/machine learning (AI/ML) applications produce results that are not easily understood from their training and input data.  This is because these systems are largely black boxes that use multiple algorithms (sometimes hundreds) to process data and return a result.  Tracing how that data is processed through those mathematical algorithms is an impossible task for a person.

Further, these algorithms are “trained,” or adjusted, based on the data used as the foundation of learning.  What is really happening is that the data adjusts the algorithms to reflect what we already know about the relationship between inputs and outputs.  In other words, we are doing a very complex type of nonlinear regression, without any inherent knowledge of a causal relationship between inputs and outputs.
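
To make the regression analogy concrete, here is a minimal sketch (my own illustration, using scikit-learn and synthetic data that the post does not mention) showing that training a small neural network is essentially fitting a nonlinear curve to known input/output pairs, with no notion of causality built in.

```python
# A minimal sketch: training a small neural network is, in effect,
# nonlinear regression.  The data and library choice are illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))             # inputs
y = np.sin(X).ravel() + rng.normal(0, 0.1, 500)   # noisy nonlinear target

# "Training" adjusts the network's weights to fit the known input/output
# pairs -- it captures correlation in the data, not causation.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X, y)

print(model.predict([[1.5]]))   # approximates sin(1.5), because that is what the data said
```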

At worst, the outputs from AI systems can sometimes seem nonsensical, based on what is known about the problem domain.  Yet because those outputs come from software, we are inclined to trust them and apply them without question.  Maybe we shouldn’t.

But it can be more subtle than that.  The results could embody a systematic bias that makes outputs seem correct, or at least plausible, when they are not, or at least not ethically right.  And users rarely have recourse to question the outputs, which leaves them facing a black box.

This is where explainable AI (XAI) comes in.  In cases where the relationship between inputs and outputs is complex and not especially apparent, users need the application to explain why it delivered a certain output.  It’s a matter of trusting the software to do what we think it is doing.  Ethical AI also plays into this concept.

So how does XAI work?  There is a long way to go here, but a couple of techniques show some promise.  XAI operates on the principles of transparency, interpretability, and explainability.  Transparency means that we need to be able to look into the algorithms to clearly discern how they process input data.  While that may not tell us how those algorithms were trained, it provides insight into the path to the results, and it is intended for interpretation by the design and development team.

Interpretability is how the results might be presented for human understanding.  In other words, if you have an application and are getting a particular result, you should be able to see and understand how that result was achieved, based on the input data and processing algorithms.  There should be a logical pathway between data inputs and result outputs.

Explainability remains a vague concept while researchers try to define exactly how it might work.  We might want to support queries into our results, or to get detailed explanations into more specific phases of the processing.  But until there is better consensus, this feature remains a gray area.
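
As one concrete way to get at least a crude interpretability signal today, here is a hedged sketch using permutation importance from scikit-learn; the technique and dataset are my choices for illustration, not something the post prescribes.

```python
# Sketch: surface which inputs actually drive a model's outputs, using
# permutation importance as a simple stand-in for richer XAI tooling.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much accuracy drops;
# the features whose shuffling hurts most are the ones the model relies on.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[idx]}: {result.importances_mean[idx]:.3f}")
```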

The latter two characteristics are more important to testers and users.  How you do this depends on the application.  Facial recognition software can usually be built to describe facial characteristics and how they match up to values in an identification database.  It becomes possible to build at least interpretability into the software.

But interpretability and explainability are not as easy when the problem domain is more ambiguous.  How can we interpret an e-commerce recommendation that may or may not have anything to do with our product purchase?  I have received recommendations on Amazon that clearly bear little relationship to what I have purchased or examined, so we don’t always have a good path between source and destination.

So how do we implement and test XAI? 

Where Testing Gets Involved

Testing AI applications tends to be very different from testing traditional software.  Testers often don’t know what the right answer is supposed to be.  XAI can be very helpful in that regard, but it’s not the complete answer.

Here’s where XAI can help.  If the application is developed and trained so that the algorithms show their steps in getting from problem to solution, then we have something that is testable.
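
For example, if an application exposes a ranked explanation alongside each prediction, a tester can write assertions against it.  A minimal sketch follows; the explain() interface and the feature names are hypothetical, purely to illustrate the idea.

```python
# Hypothetical sketch: if a model returns an explanation with each prediction,
# a tester can assert that the explanation makes domain sense.
def test_loan_denial_is_explained_by_credit_features(model, applicant):
    prediction, explanation = model.explain(applicant)      # hypothetical API
    top_features = {name for name, weight in explanation[:3]}

    # Domain expectation: a denial should rest on credit-related inputs,
    # not on proxies such as zip code.
    assert top_features & {"credit_score", "debt_to_income", "payment_history"}
    assert "zip_code" not in top_features
```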

Rule-based systems make this easier, because the rules encode a large part of the knowledge.  In neural networks, however, the algorithms rule, and they bear little visible relationship to the underlying intelligence.  But rule-based intelligence is much less common today, so we have to go back to the data and the algorithms.

Testers often don’t have control over how AI systems work to create results.  But they can delve deeply into both data and algorithms to come up with ways to understand and test the quality of systems.  It should not be a black box to testers or to users.  How do we make it otherwise?

Years ago, I wrote a couple of neural network AI applications that simply adjusted their algorithms in response to training, without any insight into how that happened.  While this may work in cases where the connection isn’t important, knowing how our algorithms contribute to our results has become vital.

Sometimes AI applications “cheat”, using cues that do not accurately reflect the knowledge within the problem domain.  For example, it may be possible to recognize people not through their facial characteristics, but through their surroundings.  You may have data indicating that I live in Boston, and use the Boston Garden in the background as your cue, rather than my own face.  That may be accurate (or may not be), but it’s not facial recognition.

A tester can use an XAI application here to help tell the difference.  That’s why developers need to build in this technology.  But testers need deep insight into both the data and the algorithms.
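
A rough sketch of what that check might look like, assuming the XAI tool produces a per-pixel attribution (saliency) map and the face bounding box is known; the names and the 50 percent threshold are mine.

```python
# Rough sketch: flag a "recognition" whose evidence lies mostly outside
# the face region -- a hint that the model is keying on background cues.
import numpy as np

def attribution_inside_face(saliency_map: np.ndarray, face_box) -> float:
    """Fraction of total attribution that falls inside the face bounding box."""
    top, left, bottom, right = face_box
    total = np.abs(saliency_map).sum()
    inside = np.abs(saliency_map[top:bottom, left:right]).sum()
    return inside / total if total > 0 else 0.0

# A tester might require most of the evidence to come from the face itself:
# saliency = explain_image(model, photo)   # whatever XAI tool is in use (hypothetical)
# assert attribution_inside_face(saliency, face_box) > 0.5
```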

Overall, a human in the loop remains critical.  Unless someone is looking critically at the results, they can be wrong, and quality will suffer.

There’s no one correct answer here.  Instead, testers need to be intimately involved in the development of AI applications, and insist on explanatory architecture.  Without that, there is no way of comprehending the quality that these applications need to deliver actionable results.

Should Testers Learn to Code?  The Definitive Answer September 23, 2020

Posted by Peter Varhol in Software development, Strategy.

I came across this fundamental question yet again today, and am long since weary of reading answers.  Those who ask the question are predisposed to a particular answer (almost always in the affirmative).  Tired of the mountains of answers that are there to be climbed, I decided to cogitate on the definitive answer for all time, to bury this question in the sands of time.

Before my answer sends gravity ripples across Known Space, let me say that I like to take contrarian viewpoints when possible.  A few years ago, my friends at TestRail published a blog post on the topic, and I responded with my own post entitled “Should Coders Learn to Test?”  Regrettably, my levity was not well received.

The answer to the fundamental question, however, is yes.  It’s clear that in a world without constraints, testers should learn to code.  More knowledge is always better than less, even if the value of that knowledge is indeterminate at the time it is acquired.

But we live in a world of constraints, whether it be time, other alternatives, inclination, aptitude, or other.  These constraints are almost always a deciding factor on the actions we take in navigating our professional lives.  How we respond to those constraints defines the directions we take at various points in life.

It is possible that not learning to code can have detrimental effects on testers in both the short and long term.  If team and management expectations are that testers provide the kinds of error detection and analysis that assume an in-depth knowledge of the code, then does not having the skills or inclination to do so penalize their standing with the team and their career prospects?

But it is just as possible that other knowledge could be just as effective in project and career success.  Testers may be experts in the domain, able to offer invaluable advice on how the software will really be used; they may write the best documentation; or they may have the best problem-solving skills.  Yet culturally we denigrate them because they can’t code?

I’ll explore that thought later in more detail, but it occurs to me that we as software professionals are more or less stuck in an Agile-ish way of thinking about our projects.  We bandy about terms like Scrum, sprints, product owner, Jira, and retrospective as if they magically confer a skill and efficiency on the team that was not there in the past.  And we truly believe that Agile team members don’t specialize; we are all just team members, which enables us to do any task required by the project.  I would like to question that assumption.

I casually follow American football during the season, and listen to the talking heads praise coaches (Scrum masters???) such as the New England Patriots’ Bill Belichick for adapting to the strengths of his individual players, and designing offensive and defensive strategies that take advantage of those strengths.  In contrast, many other coaches seem to fixate on their preferred “systems”, remaining in their comfort zones and forcing the players to adapt to their preferences.  In many cases, those coaches don’t seem to last very long.

Continuing that analogy, can we adapt our project teams to take advantage of the strengths of individual team members, rather than always force them into an Agile methodology?  Perhaps we need to study more about team structure and interpersonal dynamics rather than how to properly formulate and carry out Scrum.  Let’s use our people’s strengths, rather than simply use our people.

<To be continued>

Automation Can Be Dangerous December 6, 2018

Posted by Peter Varhol in Software development, Software tools, Strategy, Uncategorized.

Boeing has a great way to prevent aerodynamic stalls in its 737 MAX aircraft.  A set of sensors determines, through airspeed and angle of attack, that an aircraft is about to stall (that is, lose lift on its wings), and the system automatically pitches the nose down to recover.

Apparently malfunctioning sensors on Lion Air Flight 610 caused the aircraft nose to pitch sharply down absent any indication of a stall.  Preliminary analysis indicates that the pilots were unable to overcome the nose-down attitude, and the aircraft dove into the sea.  Boeing’s solution to this automation fault was explicit, even if its documentation wasn’t: turn off the system.
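
To be clear about the lesson rather than the specifics, here is a toy sketch, with invented names and thresholds that have nothing to do with Boeing’s actual logic, of automation that cross-checks its sensors and stands down when they disagree.

```python
# Toy sketch only: invented names and thresholds, not real avionics logic.
# The point is that automation should cross-check its inputs and hand
# control back to the humans when those inputs disagree.
def pitch_command(aoa_left: float, aoa_right: float, stall_aoa: float = 15.0) -> str:
    if abs(aoa_left - aoa_right) > 5.0:       # sensors disagree: act on neither
        return "DISENGAGE_AND_ALERT_CREW"
    aoa = (aoa_left + aoa_right) / 2.0
    return "NOSE_DOWN" if aoa > stall_aoa else "NO_ACTION"
```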

And this is what the software developers, testers, and their bosses don’t get.  Everyone thinks that automation is the silver bullet.  Automation is inherently superior to manual testing.  Automation will speed up testing, reduce costs, and increase quality.  We must have more automation engineers, and everyone not an automation engineer should just go away now.

There are many lessons here for software teams.  Automation is great when consistency in operation is required.  Automation will execute exactly the same steps until the cows come home.  That’s a great feature to have.

But many testing activities are not at all about consistency in operation.  In fact, relatively few are.  It would be good for smoke tests and regression tests to be consistent.  Synthetic testing in production also benefits from automation and consistency.
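
Those are exactly the kinds of checks worth automating, because they should run identically every time.  A minimal smoke-test sketch, where the endpoint URL and page text are placeholders of my own:

```python
# Minimal smoke-test sketch with pytest and requests; URL and text are placeholders.
import requests

BASE_URL = "https://staging.example.com"

def test_service_is_up():
    response = requests.get(f"{BASE_URL}/health", timeout=5)
    assert response.status_code == 200

def test_login_page_renders():
    response = requests.get(f"{BASE_URL}/login", timeout=5)
    assert response.status_code == 200
    assert "Sign in" in response.text    # assumed page content
```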

Other types of testing?  Not so much.  The purpose of regression testing, smoke testing, and testing in production is to validate the integrity of the application, and to make sure nothing bad is currently happening.  Those are valid goals, but they are only the start of testing.

Instead, testing is really about individual users and how they interact with an application.  Every person does things on a computer just a little differently, so it behooves testers to do the same.  This isn’t harkening back to the days of weeks or months of testing, but rather acknowledging that the purpose of testing is to ensure an application is fit for use.  Human use.

And sometimes, whether through fault or misuse, automation breaks down, as in the case of the Lion Air 737.  And teams need to know what to do when that happens.

Now, when you are deploying software perhaps multiple times a day, it seems like it can take forever to sit down and actually use the product.  But remember the thousands more who are depending on the software and the efforts that go behind it.

In addition to knowing when and how to use automation in software testing, we also need to know when to shut it off, and use our own analytical skills to solve a problem.  Instead, all too often we shut down our own analytical skills in favor of automation.

I Don’t Need a Hero October 23, 2018

Posted by Peter Varhol in Software development, Software platforms, Strategy.

Apologies to Bonnie Tyler, but we don’t need heroes, as we have defined them in our culture.  “He’s got to be strong, he’s got to be fast, and he’s got to be fresh from the fight.”  Um, no.

Atul Gawande, author of The Checklist Manifesto, makes it clear that the heroes, those in any profession who create a successful outcome primarily on the strength of superhuman effort, don’t deserve to be recognized as true heroes.  In fact, we should try to avoid circumstances that appear to require a superhuman effort.

So what are heroes?  We would like to believe that they exist.  Myself, I am enamored with the astronauts of a bygone era, who faced significant uncertainties in pushing the envelope of technology and accepted that their lives were perpetually on the line.  But, of course, they were also the ones who thought they were better than those who sacrificed their lives, simply because they survived.

Today, according to Gawande, the heroes are those who can follow checklists in order to make sure that they don’t forget any step in a complex process.  The checklists themselves can be simple, in that they exist to prompt professionals to remember and execute seemingly simple steps that are often forgotten in the heat of crisis.

In short, Gawande believes in commercial airline pilots, such as Chesley (Sully) Sullenberger, who with his copilot Jeffrey Skiles glided their wounded plane to a ditching in the Hudson River off Midtown Manhattan.  Despite the fact that we all know Sully’s name in the Miracle on the Hudson, it was a team effort by the entire flight crew.  And they were always calm, and in control.

Today, software teams are made up of individuals, not close team members.  Because they rarely work as a team, it’s easy for one or more individuals to step up and fix a problem without the help of the team.

There are several problems with that approach, however.  First, if an extra effort by one person is successful, the team may not try as hard in the future, knowing that they will be bailed out of difficult situations.  Second, the hero is not replicable; you can’t count on it again and again in those situations.  Third, the hero can’t solve every problem; other members of the team will eventually be needed.

It feels good to be the hero, the one who by virtue of extreme effort fixes a bad situation.  The world loves you.  You feel like you’ve accomplished something significant.  But you’re not at all a hero if your team wasn’t there for you.

Google AI and the Turing Test May 12, 2018

Posted by Peter Varhol in Algorithms, Machine Learning, Software development, Technology and Culture, Uncategorized.

Alan Turing was a renowned British mathematician who, during World War II, worked on cryptography at Bletchley Park.  He was an early computer pioneer, and today he is probably best known for the Turing Test, a way of distinguishing between computers (largely hypothetical at the time) and humans.

More specifically, the Turing Test was designed to see whether a computer could pass for a human being, and it was based on having a conversation with the computer.  If the human could not distinguish between talking to a human and talking to a computer, the computer was said to have passed the Turing Test.  No computer has ever done so, although Joseph Weizenbaum’s Eliza psychology therapist in the 1960s was pretty clever (think Carl Rogers).
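
Eliza’s “cleverness” was little more than keyword patterns and canned reflections.  Here is a tiny sketch in that spirit; these rules are my own illustration, not Weizenbaum’s original DOCTOR script.

```python
# A tiny Eliza-flavored sketch: keyword patterns plus canned responses.
# Illustrative rules only, not Weizenbaum's original script.
import re

RULES = [
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bmy (\w+)", re.I),    "Tell me more about your {0}."),
    (re.compile(r"\bbecause\b", re.I),   "Is that the real reason?"),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."

print(respond("I feel anxious about my work"))   # "Why do you feel anxious about my work?"
```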

The Google AI passes the Turing Test.  https://www.youtube.com/watch?v=D5VN56jQMWM&feature=youtu.be.

I’m of two minds about this.  First, it is a great technical and scientific achievement.  This is a problem that for decades was thought to be intractable.  Syntax has definite structure and is relatively easy to parse.  While humans seem to understand language semantics instinctively, there are ambiguities that can only be learned through training.  That’s where deep learning through neural networks comes in.  And to respond in real time is a testament to today’s computing power.

Second, do we really need this because we don’t want to have phone conversations?  Of course, the potential applications go far beyond calling to make a hair appointment.  For a computer to understand human speech and respond intelligently to the semantics of human words, it requires significant training in human conversation.  That certainly implies deep learning, along with highly sophisticated algorithms.  It can apply to many different types of human interaction.

But no computing technology is without tradeoffs, and intelligent AI conversation is no exception.  I’m reminded of Sherry Turkle’s book Reclaiming Conversation.  It posits that people are increasingly afraid of having spontaneous conversations with one another, mostly because we cede control of the situation.  We prefer communications where we can script our responses ahead of time to conform to our expectations of ourselves.

Having our “AI assistant” conduct many of those conversations for us seems like simply one more step in our abdication as human beings, unwilling to face other human beings in unscripted communications.  Also, it is a way of reducing friction in our daily lives, something I have written about several times in the past.

Reducing friction is also a tradeoff.  It seems worthwhile to make day to day activities easier, but as we do, we also fail to grow as human beings.  I’m not sure where the balance lies here, but we should not strive single-mindedly to eliminate friction from our lives.

5/14 Update:  “Google Assistant making calls pretending to be human not only without disclosing that it’s a bot, but adding “ummm” and “aaah” to deceive the human on the other end with the room cheering it… horrifying. Silicon Valley is ethically lost, rudderless and has not learned a thing…As digital technologies become better at doing human things, the focus has to be on how to protect humans, how to delineate humans and machines, and how to create reliable signals of each—see 2016. This is straight up, deliberate deception. Not okay.” – Zeynep Tufekci, Professor & Writer 

Lena, Fabio, and the Mess of Computer Science April 11, 2018

Posted by Peter Varhol in Publishing, Software development, Technology and Culture.

The book Brotopia opens with a description of Lena, the November 1972 Playboy centerfold whose photo by chance was used in early research into image processing algorithms at USC.  Over time, that singular cropped image became a technical standard for measuring the output of graphics algorithms.  Even today it is used in academic research to illustrate the relative merits of alternative algorithms.
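
“Measuring the output” here usually means computing a fidelity metric between the original image and the processed one.  A short sketch of the most common such metric, peak signal-to-noise ratio (PSNR), with random arrays standing in for real image data:

```python
# Sketch of PSNR, the kind of fidelity metric historically reported against
# the Lena image; the arrays below stand in for real image data.
import numpy as np

def psnr(original: np.ndarray, processed: np.ndarray, max_value: float = 255.0) -> float:
    mse = np.mean((original.astype(float) - processed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10 * np.log10(max_value ** 2 / mse)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(512, 512))
noisy = np.clip(image + rng.normal(0, 5, image.shape), 0, 255)
print(f"PSNR: {psnr(image, noisy):.1f} dB")
```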

But today this image is also controversial.  Some complain that it serves to objectify women in computer science.  Others say it is simply a technical standard in the field.  A woman mathematics professor applied similar graphics algorithms to Fabio in an attempt to bring some balance to the discussion.

In the 8th grade (around the time of Lena), my middle school (Hopewell Junior High School) partitioned off boys to Shop class, and girls to Home Ec.  Perhaps one boy a year asked for Home Ec class, but it could only be taken by boys as a free elective, and was viewed as an oddity.  During my time there, to my knowledge no girl asked to be in Shop class.

Of course, I thought nothing of it at the time, but today such a segregation is troubling.  And even in 2015, a high school computer science class used Lena to show off their work with graphics algorithms, to mixed reviews.

There are many serious problems with the cult of the young white male in tech today.  As we continue to engage this demographic with not-so-subtle inducements to their libidos, we also enable them to see themselves as the Masters of the (Tech) Universe.  That worked out so well for the financial trading firms in the market failures of the 1980s and 2000s, didn’t it?

Does the same dynamic also make it more difficult for women to be taken seriously in tech?  I think that it is part of the problem, but by no means the only part.  Women in tech are like people in any field – they want to do their jobs, without having to put up with cultural and frat-boy behaviors that make it that much more difficult to do so.

I’ve been fortunate to know many smart and capable women throughout my life.  I had a girlfriend in college who was simply brilliant in mathematics and chemistry (in contrast, I was not brilliant at anything at that point in my life).  She may have been one of the inspirations that led me to continue plugging away at mathematics until I managed a limited amount of success at it.  Others try to do their best under circumstances that they shouldn’t have to put up with.

So let’s give everyone the same chance, without blatant and subtle behaviors that demean them and make them feel less than what they are.  We don’t today.  Case in point, Uber, which under Travis Kalanick was the best-known but by no means the only offender.  I hope we can improve, but despair that we can’t.

Bias and Truth and AI, Oh My October 4, 2017

Posted by Peter Varhol in Machine Learning, Software development, Technology and Culture.

I was just accepted to speak at the Toronto Machine Learning Summit next month, a circumstance that I never thought might happen.  I am not an academic researcher, after all, and while I have jumped back into machine learning after a hiatus of two decades, many more are fundamentally better at it than me.

The topic is Cognitive Bias in AI:  What Can Go Wrong?  It’s rather a follow-on from the presentations I’ve done on bias in software development and testing, but it doesn’t really fit into my usual conferences, so I attempted to cast my net into new waters.  For some reason, the Toronto folks said yes.

But it mostly means that I have to actually write the presentation.  And here is the rub:  We tend to believe that intelligent systems are always correct, and in those rare circumstances where they are not, it is simply the result of a software bug.

No.  A bug is a one-off error that can be corrected in code.  A bias is a systematic adjustment toward a predetermined conclusion that cannot be fixed with a code change.  At the very least, the training data and the machine learning architecture have to be rethought.
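
Here is a hedged sketch of the difference, using synthetic data of my own: the classifier below systematically shortchanges the minority group, and no one-line code fix will change that, because the bias lives in the training data.

```python
# Sketch: bias baked in by skewed training data, not by a code bug.
# The synthetic data is illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# 95% of training examples come from group A (label 0), 5% from group B (label 1).
X_a = rng.normal(loc=0.0, scale=1.0, size=(950, 2))
X_b = rng.normal(loc=1.0, scale=1.0, size=(50, 2))
X = np.vstack([X_a, X_b])
y = np.array([0] * 950 + [1] * 50)

model = LogisticRegression().fit(X, y)

# Overall accuracy looks fine, yet group B is likely to be routinely
# misclassified.  That is systematic bias; the fix is rebalanced data or a
# rethought model, not a patch to any single line of code.
print("overall accuracy:", model.score(X, y))
print("accuracy on group B:", model.score(X_b, np.ones(50, dtype=int)))
```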

And we have examples such as these:

If you’re not a white male, artificial intelligence’s use in healthcare could be dangerous.

When artificial intelligence judges a beauty contest, white people win.

But the fundamental question, as we pursue solutions across a wide range of applications, is:  Do we want human decisions, or do we want correct ones?  That’s not to say that all human decisions are incorrect, but only to point out that much of what we decide is colored by our bias.

I’m curious about what AI applications decide about this one.  Do we want to eliminate the bias, or do we want to reflect the values of the data we choose to use?  I hope the former, but the latter may win out, for a variety of reasons.