Is artificial intelligence (AI) cyber safe?

18.09.18 08:57 PM | By Jordan

While there are many benefits associated with the Fourth Industrial Revolution, one of the challenges it poses is that the security of computer systems, mobile technology and individuals will be increasingly compromised in the future. Towards the end of August, for instance, Microsoft fought off Russian hackers.

This is not even a case of it compromising some and not others; we will all eventually face some form of threat at one stage or another. Look at the recent racism scandal surrounding Adam Catzavelos: within hours of his video going viral, someone posted his entire identity online, including his name, surname, ID number, street address and even his telephone number.

While this was an overt act of breaching someone’s security (and we can have a debate on whether this was necessary at a later stage), there are cases where security is breached without our knowledge.

The world of laziness

You have just arrived home after putting in a 14-hour shift at the office; your boss has been chewing your ear off the whole day, and your dog has ripped your couch to shreds. The only thing you want to do at this stage is collapse on the couch and have the following conversation…

You: “Alexa…please order me a large margherita pizza.”

Alexa: “Would you like extra cheese with that?”

You: “Please add extra cheese.”

Alexa: “Domino’s Pizza has a special: if you order a large pizza, you can get eight mozzarella sticks for an extra R20.”

You: “Fine.”

Alexa: “The order will come to R100 and delivery will take 45 minutes.”

The fact that we can have these types of conversations in this day and age is a remarkable indication of how far we have come as a species, and how much Artificial Intelligence (AI) has evolved.

However, there are security risks associated with AI. I recently read an article on threatpost.com which pointed these out in quite an eye-opening manner.

Early warning system

The article points out that while artificial intelligence and machine learning are far from new, many in security suddenly believe these technologies will transform their business and enable them to detect every cyber threat that comes their way. Instead, the hype may create more problems than it solves.

Recently, cybersecurity firm ESET surveyed 900 IT decision makers on their opinions of artificial intelligence and machine learning in cybersecurity practices.

The article adds that, according to the research, “the recent hype surrounding artificial intelligence (AI) and machine learning (ML) is deceiving three in four IT decision makers (75 percent) into believing the technologies are the ‘silver bullet’ to solving their cybersecurity challenges.”

The hype, ESET says, causes confusion among IT teams and could put organizations at greater risk of falling victim to cybercrime. According to ESET’s CTO, Juraj Malcho (who spoke to threatpost.com), “when it comes to AI and ML, the terminology used in some marketing materials can be misleading and IT decision makers across the world aren’t sure what to believe.”

Looking past the hype cycle, IT teams can achieve real value from the machine learning and artificial intelligence available today.

Types of ‘Learning’

The article points out that, despite what marketing-speak says, there are different ways to implement machine learning: supervised and unsupervised learning.

In supervised learning, specific data is collected and a defined output is used to train the model. This requires actual training of the system; in other words, a human must provide the expected output data to make the system useful. Most IT teams are reluctant to do this because it doesn’t remove the human from the system.

The article adds that unsupervised learning is what the market is looking for, as it does remove the human. You don’t need the output in this model. Instead, you feed data into the system and it looks for patterns from which it can adapt dynamically.
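To make the distinction concrete, here is a minimal Python sketch (using scikit-learn; the event features and labels are made up for illustration). The supervised model only works once a human has supplied the expected output, while the unsupervised model hunts for outliers on its own:

```python
# Minimal sketch contrasting supervised and unsupervised learning.
# The event features and labels below are made up for illustration.
from sklearn.ensemble import IsolationForest, RandomForestClassifier

# Each row: [logins_per_hour, megabytes_sent, distinct_hosts_contacted]
events = [[3, 12, 2], [4, 10, 3], [2, 15, 2], [90, 800, 40]]

# Supervised: a human must supply the expected output (the labels).
labels = [0, 0, 0, 1]  # 0 = benign, 1 = malicious
clf = RandomForestClassifier(random_state=0).fit(events, labels)
print(clf.predict([[85, 750, 35]]))  # [1]: resembles the labelled attack

# Unsupervised: no labels; the model looks for patterns/outliers itself.
iso = IsolationForest(contamination=0.25, random_state=0).fit(events)
print(iso.predict([[85, 750, 35]]))  # [-1]: flagged as an anomaly
```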

Ask the Right Questions

Most IT teams simply want to ask broad questions and get answers to queries like “find lateral movement.” Unfortunately, this is not possible today.

The article points out that you can use ML/AI to identify characteristics of lateral movement by asking questions like “Has this user logged in during this timeframe?” “Has the user ever connected to this server?” or “Does the user typically use this computer?” These types of questions are descriptive, not predictive. They infer answers by comparing new and historical data.

Analysts follow an attack down a logical path and ask questions at each step. Computers identify deviations from baselines and determine the risk level by tracing the anomalies. This is the intersection where machines and humans come together for better results.
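In code, those descriptive questions are simple lookups against historical data rather than predictions. A hypothetical sketch (the login-log format is my assumption, not something from the article):

```python
# Hypothetical sketch: answering descriptive questions against a login
# history. The (user, server, hour) log format is assumed for illustration.
login_history = [
    ("alice", "db01", 9), ("alice", "db01", 10), ("alice", "web02", 14),
]

def has_logged_in_during(user, start_hour, end_hour):
    """Has this user logged in during this timeframe?"""
    return any(u == user and start_hour <= h <= end_hour
               for u, _, h in login_history)

def has_ever_connected(user, server):
    """Has the user ever connected to this server?"""
    return any(u == user and s == server for u, s, _ in login_history)

# Descriptive, not predictive: compare the new event with historical data.
print(has_logged_in_during("alice", 2, 4))  # False: a 3 a.m. login is new
print(has_ever_connected("alice", "dc01"))  # False: never touched dc01
```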

What Can Be Done Today With ML/AI?

The article adds that, in reality, you must establish a strong baseline of the data to get value from ML/AI. Only then can you evaluate new input data and make associations between it and the normal state of the network.

Here are threats that ML/AI can identify:

  • DNS Data Exfiltration. While this is difficult to prevent, it is easy to detect because the system can examine DNS traffic and know when DNS queries go to an authoritative server but don’t receive a valid response. When queries like 0800fc577294c34e0b28ad2839435945.badguy.examplenet are sent many times from a given network machine, the system can alert IT professionals (see the sketch after this list); and
  • Credential Misuse. According to Verizon’s 2018 Data Breach Investigations Report, humans are one of the biggest problems for organizations. Ninety-six percent of attacks come from email. On average, only 4 percent of people fall for any given phishing attack, but a malicious actor only needs one victim to provide credentials.
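As a rough illustration of the DNS case, here is a hypothetical Python sketch. It flags a machine that repeatedly sends long, high-entropy DNS queries that never get a valid answer; the log format and thresholds are my own assumptions:

```python
# Hypothetical sketch: flag hosts that repeatedly send long, random-looking
# DNS queries that never receive a valid response. The log format and the
# thresholds are my own assumptions.
import math
from collections import Counter, defaultdict

def entropy(label):
    """Shannon entropy of a DNS label; encoded data scores high."""
    counts = Counter(label)
    return -sum(n / len(label) * math.log2(n / len(label))
                for n in counts.values())

# (source_host, queried_name, received_valid_response)
dns_log = [
    ("10.0.0.5", "0800fc577294c34e0b28ad2839435945.badguy.examplenet", False),
    ("10.0.0.5", "9f86d081884c7d659a2feaa0c55ad015.badguy.examplenet", False),
    ("10.0.0.7", "www.dominos.co.za", True),
]

suspects = defaultdict(int)
for host, name, answered in dns_log:
    first_label = name.split(".")[0]
    if not answered and len(first_label) > 20 and entropy(first_label) > 3.0:
        suspects[host] += 1

for host, count in suspects.items():
    if count >= 2:  # "sent many times from a given network machine"
        print(f"ALERT: possible DNS exfiltration from {host} ({count} queries)")
```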

The article points out that machine learning is useful against credential misuse because the users have been baselined: they connect to and log in to a set number of devices each day. It’s easy for a human to see when a credential is tried hundreds of times on one server, but it’s hard to catch someone who tries to connect to 100 different machines on the network and only succeeds once.
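A hypothetical sketch of that credential-misuse check, assuming a simple authentication log and made-up thresholds:

```python
# Hypothetical sketch: spot a credential tried against many machines with
# almost no successes. The log format and thresholds are assumptions.
from collections import defaultdict

# (user, target_machine, login_succeeded)
auth_log = [("svc_backup", f"host{i:03d}", False) for i in range(100)]
auth_log.append(("svc_backup", "host042", True))  # the single success
auth_log += [("alice", "web02", True), ("alice", "db01", True)]

machines_tried = defaultdict(set)
successes = defaultdict(int)
for user, machine, ok in auth_log:
    machines_tried[user].add(machine)
    successes[user] += ok

for user, machines in machines_tried.items():
    # Baseline assumption: a normal user logs in to a handful of devices.
    if len(machines) > 20 and successes[user] <= 2:
        print(f"ALERT: {user} tried {len(machines)} machines, "
              f"succeeded {successes[user]} time(s)")
```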

Separating real from hype

The article adds that while we are far from a type of artificial intelligence that can solve all cybersecurity problems, it is important to understand what’s real and what’s hype. As Malcho stated, “the reality of cybersecurity is that true AI does not yet exist. As the threat landscape becomes even more complex, we cannot afford to make things more confusing for businesses. There needs to be greater clarity as the hype is muddling the message for those making key decisions on how best to secure their company’s networks and data.”

Ultimately, the best solutions will be a combination of both supervised and unsupervised learning models: leveraging supervised learning to identify granular patterns of malicious behavior, while unsupervised algorithms develop a baseline for anomaly detection. Humans will not be eliminated from this equation any time soon.

The truth and untruths

How much is AI a part of the security of your system? An article on csoonline.com spells it out.

The article points out that we need to start by dispelling the most common misconception: there is very little, if any, true artificial intelligence (AI) being incorporated within enterprise security software. The fact that the term comes up frequently has largely to do with marketing and very little to do with the technology. Pure AI is about reproducing cognitive abilities.

That said, machine learning (ML), one of many subsets of artificial intelligence, is being baked into some security software. But even the term machine learning may be employed somewhat optimistically. Its use in security software today has more in common with the rules-based “expert systems” of the 1980s and 1990s than it does with true AI. If you’ve ever used a Bayesian spam trap and trained it with thousands of known spam emails and thousands of known good emails, you have a glimmer of how machine learning works, if not the scale. In most cases it is not capable of self-training; it requires human intervention, including programming, to update its training. And there are so many variables in security, so many data points, that keeping its training current, and therefore effective, can be a challenge.
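For a glimmer of how that works in practice, here is a toy Bayesian spam trap in Python (using scikit-learn’s naive Bayes; a real trap needs thousands of labelled emails, not four):

```python
# Toy Bayesian spam trap: train on known spam and known good email, then
# score a new message. A real trap needs thousands of examples, not four.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "claim your free money today",
          "meeting moved to 3pm", "please review the attached report"]
labels = ["spam", "spam", "ham", "ham"]  # human-supplied training labels

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)
model = MultinomialNB().fit(features, labels)

# The model cannot self-train: a human must label new data and refit it.
print(model.predict(vectorizer.transform(["free money prize"])))  # ['spam']
```

Note that the model only improves when a human labels fresh examples and refits it, which is exactly the self-training limitation described above.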

Effective when trained

The article adds that machine learning can, however, be very effective when it is trained by people who know what they’re doing, with a high volume of data from the environment in which it will be used. Although complex systems are possible, machine learning works better on targeted tasks or sets of tasks than on a wide-ranging mission.

One of machine learning’s greater strengths is outlier detection, which is the basis of user and entity behavior analytics (UEBA), says Chris Kissel, IDC research director, global security products. “The short definition of what UEBA does,” he adds, “is determining whether an activity emanating from or being received by a given device is anomalous.” UEBA fits naturally into many major cybersecurity defensive activities.
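Outlier detection of that kind can be surprisingly simple at its core. A hypothetical sketch that checks whether a device’s activity deviates from its own baseline (the data and the three-sigma threshold are assumptions):

```python
# Hypothetical UEBA-style sketch: flag a device whose activity is far from
# its own baseline. The data and three-sigma threshold are assumptions.
import statistics

# Daily outbound megabytes for one device over two weeks of normal use.
baseline = [52, 48, 55, 50, 47, 53, 49, 51, 54, 46, 50, 52, 48, 51]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(observed_mb, sigmas=3.0):
    """Is this activity an outlier relative to the device's own baseline?"""
    return abs(observed_mb - mean) > sigmas * stdev

print(is_anomalous(51))   # False: within the device's normal range
print(is_anomalous(900))  # True: anomalous traffic emanating from the device
```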

The article points out that when a machine learning system is trained thoroughly and well, in most cases you’ve defined the known good events. That lets your threat intelligence or security monitoring system focus on identifying anomalies. But what happens when the system is trained by the vendor solely with its own generic data? Or when it is trained with an insufficient volume of events? Or when too many outliers lack identification and become part of a rising din of background noise? You may wind up with the bane of enterprise threat detection software: an endless succession of false positives. If you’re not training your machine learning system on an ongoing basis, you won’t get the real advantage that ML has to offer, and as time goes by your system will become less effective.
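One low-tech way to keep training current is to rebuild the baseline from a rolling window of recent observations, so old behaviour ages out. A minimal sketch, with the 30-observation window being my assumption:

```python
# Hypothetical sketch: refresh the baseline from a rolling window so the
# model tracks the current environment instead of going stale.
from collections import deque

window = deque(maxlen=30)  # keep only the most recent 30 observations

def observe(value):
    """Fold a new measurement into the baseline; the oldest ages out."""
    window.append(value)

def baseline_mean():
    return sum(window) / len(window)

for daily_mb in [50, 52, 49, 55, 51, 48]:
    observe(daily_mb)
print(round(baseline_mean(), 1))  # the baseline reflects only recent data
```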

Jordan
