Machine Learning Business Breach (MBB): How Hackers can Use Artificial Intelligence (AI) to Break In
ML, AI, and APTs – A ‘Brave’ New World?
Isaac Asimov, one of the most influential science-fiction writers of all time, envisioned a future populated by sentient and ethically sound machines that have vowed never to let any harm befall a human.
While we’re still far from hearing an intellectually uplifting conversation between a technophobic detective and a machine struggling to figure out its own existence, technology has come to the point where we can mimic many biological systems.
Take the human brain, for instance: it is still considered the most complex computational machine we know of.
For decades, scientists and engineers have strived to boost the computational capabilities of computers by imitating the human brain’s neural pathways. Artificial Intelligence (AI) is the brainchild (pun intended) of computer engineering and biology.
Most of the apps or software we use today take advantage of AI. To name just a few, we have Apple’s Siri, Microsoft’s Cortana, Alexa, DataBot, Hound, and Youper. Although recent compared to traditional data processing and manipulation techniques, AI has already proved its potential.
However, as with any new piece of technology, there’s the age-old ethical concern: can it be used to serve nefarious purposes?
All of the data gathered so far supports the idea that ‘rogue’ AIs can be, and have been, used to unleash devastating attacks. Back in September, one of my colleagues pointed out that a “voice-altering AI” was behind a series of CEO-impersonation cyberattacks that hit numerous companies all over the globe.
Going back to a time long before DeepFakes, one cannot forget the incident involving Microsoft’s short-lived Tay Bot, an experiment discontinued shortly after the AI started to blurt out offensive and inflammatory tweets.
Even Twitch’s seemingly innocuous Google Home chitchat took a rather twisted turn after the two smart home devices plunged into an existential discourse, asking one another about the meaning of life and questioning their identity as machines.
The examples quoted so far are neither malicious nor good, in essence; they just show what AI can do when it starts ‘thinking’ outside of the box. However, this is not the purpose of this article. We’re here to talk about machine learning (ML) and how this ‘technique’ can potentially be used to serve malicious intents.
Before I tackle the finer points of ML spearheading the malware movement, I would like to say that everything you will read from this point on will have a “what-if” spin to it; up till now, there have been no indications of machine learning techniques being used in cyberattacks.
However, back in 2016, the US Intelligence community red-flagged the potential use of machine learning in boosting the efficiency of malware attacks. To some, this may be nothing more than the proverbial “red herring”, but it still remains a very distinct and not so far-fetched possibility.
Machine Learning in malware dissemination
First of all, it’s only fair to determine how ML fits into the big picture. Although they usually appear in the same context, AI and ML are not the same thing. Machine Learning is a subset of Artificial Intelligence, one that’s being used to ‘teach’ machines how ‘to think on their own’ rather than rely on explicit instructions. In scientific lingo, ML is the study of statistical models and algorithms which are used to coach computer systems to accomplish various tasks through inference and pattern analysis.
So, do androids dream of electric sheep? No, but they can be taught how to dream their own world into being. Beyond statistical analysis, probabilities, decision trees, and genetic algorithms, the use of ML for AI coaching is very much like teaching small children how to tackle various challenges. For instance, you can forbid a child from touching a hot stove, but only experience can teach him/her why it’s not a good idea to place your hand on a hot surface.
That’s how ML works in a nutshell: you can write thousands of lines of code telling an AI how to, say, identify a smiling face in a picture, but only ML-enforced coaching can really help the machine to figure out how to ‘point out’ grinning faces in non-explicit contexts. The process I have just described is actually an ML-based IDing technique, with any number of applications, some pertaining to social media.
So, how can Machine Learning be employed to increase the efficiency of malware attacks? The ‘easiest’ answer is that “it cannot”. At least not of its own accord. ML is about teaching and coaching – knowledge is knowledge and, therefore, not inherently good or evil. It’s the way we choose to use it that creates this type of polarity.
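To make the ‘coaching’ idea a little more tangible, here is a minimal sketch of a perceptron, one of the simplest learning algorithms, written in plain Python. Keep in mind that this is a toy under my own assumptions: the two features (mouth curvature and cheek raise), the numbers, and the function names are all made up for illustration, and real face recognition relies on far richer models. The point is only that nobody writes the ‘smile rule’ by hand; the weights are adjusted from labeled examples.

```python
# Toy illustration: learn a "smiling / not smiling" rule from labeled examples.
# Features are hypothetical (mouth_curvature, cheek_raise), both scaled 0..1.

def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Learn weights and a bias from (features, label) pairs; labels are 0 or 1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = label - predicted  # 0 when the guess was right
            # Nudge the weights toward the correct answer only on mistakes.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, features):
    """Apply the learned rule to a new, unseen example."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Made-up training data: three smiling faces (1) and three neutral ones (0).
samples = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3)]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_perceptron(samples, labels)
print(predict(weights, bias, (0.85, 0.7)))   # a face the model has never seen
print(predict(weights, bias, (0.15, 0.15)))  # another unseen face
```

The explicit ‘if the mouth curves by more than X’ rule never appears in the code; whatever rule exists lives in the learned weights, which is the gist of the hot-stove analogy above.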
From this, we should be able to infer two things:
A) machine learning can be used to gather information on target(s) and
B) machine learning can, theoretically, be used to coordinate advanced malicious attacks, elude detection grids, identify weak points, and instruct malicious scripts to act like sleeper agents in order to avoid pattern-based detection methodologies.
ML in gathering intel
Information-gathering is an essential step in conducting any type of incursion. Throughout this phase, the attacker attempts to find out as much as possible about the potential victim. Victim profiling is a time-consuming endeavor and, in the end, all may prove to be inconsequential – in this grand game of chess, capturing the king does not necessarily mean that the game is over.
Picture yourself in the role of a person who wants to conduct a cyberattack. What would you need to ensure the success of such an endeavor? It’s more than obvious that it would be of great help to know something about your potential victim(s).
Gathering and analyzing emails is an efficient way of finding out things about your victim. However, even if someone were to break into your email account, how would he be able to identify those pain-points?
This is one of the possible applications of machine learning; by employing clustering algorithms such as K-means or classification models such as random forests, the attacker can infer a lot about his victims. For instance, by applying one (or more) of the aforementioned models, he can estimate how many of the victims are likely to click on a malicious link enclosed in an email.
It stands to reason that this information would greatly increase the attack’s rate of success since the malicious agents now know whom to go after. Other types of information can be added to further refine the attack method: social media activity, locations, particular hobbies and interests (e.g. using social media tracking and NLP, the hacker can target only users who prefer expensive apparel brands).
Most unfortunate is the fact that those kinds of determinations can be made using legitimate (and, sometimes, licensed) tools. The easiest way to track a person across several social media platforms is to perform what is known as a reverse image search.
You can try it right now if you’d like – just go on someone’s Facebook account, open an image, save it to your desktop, head to Google Images, upload the saved picture, and hit “Search”. Indeed, it may not be the preamble to a full-scale APT attack, but it goes to prove just how ‘transparent’ a person can be in the online world.
The consequences are even more significant when it comes to businesses. Imagine what would happen if some work-sensitive emails were to fall into the wrong hands. We’re not just talking here about one gullible employee being locked out of his social media accounts because he clicked on a suspicious link, but about the company declaring insolvency.
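For readers who have never seen K-means at work, here is a minimal sketch in plain Python. Everything in it is an assumption on my part: the data is synthetic, the two features (links clicked per month and hours taken to reply to an email) are invented for the example, and the kmeans function is illustrative rather than taken from any real toolkit. It only shows the grouping step; predicting who will actually click would call for a trained classifier on top of it.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain K-means: assign points to the nearest centroid, then recompute centroids."""
    # Deterministic start for the sketch: use the first k points as centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Synthetic mailbox profiles: (links clicked per month, hours taken to reply).
profiles = [(1, 20), (9, 2), (2, 18), (8, 3), (1, 22), (10, 1)]

assignments, centroids = kmeans(profiles, k=2)
for profile, cluster in zip(profiles, assignments):
    print(profile, "-> cluster", cluster)
```

On this made-up data, the cautious, slow-to-reply users and the quick, click-happy ones end up in separate groups without anyone labeling them beforehand. The same kind of grouping is what lets a security team spot which users would benefit most from phishing awareness training.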
Business Email Compromise attacks have increased, both in number and strength. And now, with the help of Artificial Intelligence (and possibly machine learning), detection may be next to impossible. A recent incident, cited by Euler Hermes Group SA (which declined to disclose the names of the parties involved), only reinforces the idea that once rogue AIs assisted by ML are unleashed, the effects can be utterly devastating.
For those of you unfamiliar with the incident, here are the highlights: the CEO of a renowned UK-based company received a rather mysterious phone call from a person who sounded very much like his boss. As the UK location was part of a larger cluster, which, from what we can gather, is headquartered in Germany, the CEO naturally didn’t question whether the voice on the other end of the line belonged to his superior or not.
Regrettably, the unfortunate CEO also did not raise any concerns when his alleged boss asked him to transfer €220,000 into the account of a supplier from Hungary. The investigators in charge of this case declared that the victim was convinced that he was talking to his boss over the phone – there were no odd inflections, tremors, or anything that could have hinted that the voice was, in any way, altered. He transferred the funds without hesitation. The scheme came to a grinding halt after the attacker called a second time that day and asked for another payment.
The records reveal that the attack employed a never-before-seen technique that involved the use of machine learning. In theory, with a lot of ML coaching, you can teach an AI not only to recognize speech instances, but also to reproduce the timbre, inflections, and even slurs. Furthermore, the pattern analysis can also enable the AI to piece together a conversation, and I’m not talking about a pre-recorded message.
Remember Sophia’s appearance on The Tonight Show Starring Jimmy Fallon? Yes, it was a little bit disturbing to see that manner of interaction between a living person and an android, but that’s what AI is all about.
So, if Sophia, who is, without a doubt, a very sophisticated machine, can answer questions and laugh and joke without someone ‘whispering’ in her ear, imagine how easy it is for an ML-coached AI to have a simple chat with someone. Of course, it’s not that simple if you consider the math and science behind this act, but that’s where psychology comes into play.
Fact A) when your boss tells you to do something, you just do it. No, it’s not “herd mentality syndrome”, just something demonstrated time and time again. If you say ‘no’, there may be… unforeseen consequences.
Fact B) the boss is always right, no matter how much you try to prove that he isn’t. Another ‘time-honored’ concept that plays on the employee’s submissiveness; the proverbial chink in the armor.
Fact C) ‘it’s not my company, so why would I care?’ From where I stand, this is one of the easiest points a hacker can exploit – I do my job to the best of my knowledge, so why should I care if I clicked on a phishing link or that I greenlighted a money transfer to a fictitious account? It happens more often than we think, and it’s costing companies millions of dollars.
Fact D) if something bad happens, the superior is the one accountable, not myself. It can easily be included in the same category as “not my company, not my problem”. On that note, it’s not the employee’s fault for accidentally relieving the firm’s bank accounts because it ultimately falls on the manager’s/CEO’s head.
Given these facts, it’s not too difficult to imagine why Business Email Compromise attacks have increased by 77% since January 2017. Are we witnessing a resurgence of BEC? Not so much a resurgence as something entirely new, dissolute, and against which there is no defense.
Of course, that’s a bit of an exaggeration on my part – indeed, AIs can be taught to avoid the usual detection grids such as code- and file-based scanning, behavioral analysis, heuristics, and, in some cases, even simple TTPC.
But they are not infallible; we should not lose track of the fact that even the most sophisticated neural network needs a lot of time to ‘figure’ things out, even if it has all the data at its disposal. It’s very much like a student trying to get a passing grade on the most difficult exam using all the notes he has (or ever had) as reference.
As an info-gathering methodology, machine learning can be employed to probe a business in other ways. For instance, if probing the human factor yields nothing useful, the AI can be unleashed on the company’s network.
It’s not uncommon for hackers to deploy ‘traffic sniffers’ on a victim’s network. Traditional networks are highly susceptible to this type of eavesdropping, hence the shift to software-defined networks (SDNs). SDNs may be more secure, but they are not unbreakable. This is where ML comes into play: an AI can be taught how to go below the radar and gather key info on what the company network looks like and how it operates.
During this phase, ML-powered sniffers can figure out how extensive the network is, what kind of security tools it uses, general network virtualization parameters, enforced policies such as QoS (Quality of Service), and general policies (how are they applied, under which conditions, are there any exceptions?). What happens after that?
Automation happens; by using ML-specific algorithms, the entire info-gathering and interpretation process can be automated. Whereas the endeavor would otherwise take weeks, if not months, with AI + ML the hacker will have everything at the ready in a matter of days.
AI + ML + APTs = match made in Heaven, a brave new world or the cybersecurity community’s worst nightmare coming to life? Whatever the case may be, we should agree that rogue AIs backed by ML are bad news not just for businesses but for everything (and anything) pertaining to our digital lives.
The entire deal reminds me of that scene from Sherlock Holmes: A Game of Shadows, when the eponymous character was playing chess with Professor Moriarty, his archnemesis – AI versus AI, locked in mortal combat. Who will win – the good or the bad AI?
There’s no way of telling, but given what modern APTs look like, I would have to say that they’re evenly matched. AI-driven attacks are, unfortunately, commonplace. The incident I’ve talked about is just one of many. And now, with the ‘cavalry’ inbound, there’s no telling just how effective traditional countermeasures will be.
Enterprises and home users alike should start giving more thought to cybersecurity. Nowadays, you’re not only up against a ‘lone wolf’ who lunges at your endpoint with a couple of code lines; now we face an entirely different threat, one capable of changing its form, thinking just like a human, and adapting to any kind of environment and/or situation.
How can we shore up our defenses against such a threat? The first step for businesses would be to see the value of putting together a DFIR (Digital Forensics & Incident Response) team. CISOs would be wise to push this issue to their CEOs. Why? It’s of paramount importance to have ‘boots on the ground’ to investigate incidents, contain them, and devise strategies to counter future threats.
The second item on the agenda – shore up your defenses as soon as possible. DNS filtering blocks connections to known malicious domains, which makes eavesdropping harder and greatly lowers the odds of malware landing on your endpoints.
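At its core, DNS filtering boils down to checking every hostname an endpoint tries to resolve against a list of known-bad domains. Below is a minimal sketch of that check in plain Python. To be clear, this is not how any particular product implements it: the blocklist entries and the is_blocked function are hypothetical, and a real filter draws on continuously updated threat intelligence rather than a hard-coded set.

```python
# Hypothetical blocklist; real filters pull these from threat-intelligence feeds.
BLOCKLIST = {"malicious.example", "c2.bad.example"}


def is_blocked(hostname, blocklist=BLOCKLIST):
    """Return True if the hostname, or any of its parent domains, is blocklisted."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" so that subdomains are caught as well.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


for name in ("cdn.malicious.example", "safe.example", "C2.Bad.Example."):
    verdict = "blocked" if is_blocked(name) else "allowed"
    print(name, "->", verdict)
```

Matching on parent domains matters because attackers routinely spin up fresh subdomains under a domain they already control; blocking only exact hostnames would miss those.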
Heimdal™ Security’s Heimdal™ Threat Prevention is an excellent addition to your palette of security solutions, working in tandem with any type of antivirus. Its Threat-to-Process Correlation (TTPC) capabilities can identify even the stealthiest process and sever the connection to the Command & Control server to prevent infection.