GoKart: A Static Analysis Tool for Securing Go Code

GoKart is a static analysis tool for Go that finds vulnerabilities using the SSA (static single assignment) form of Go source code. It is capable of tracing the source of variables and function arguments to determine whether input sources are safe, which reduces the number of false positives compared to other Go security scanners. For instance, a SQL query that is concatenated with a variable might traditionally be flagged as SQL injection; however, GoKart can figure out if the variable is actually a constant or constant equivalent, in which case there is no vulnerability.
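The two cases can be sketched in a few lines of Go. This is only an illustration of the pattern, not GoKart output; the function names buildConstQuery and buildTaintedQuery and the table name are hypothetical, and the queries are built as strings rather than sent to a database.

```go
package main

import "fmt"

// tableName is a compile-time constant, so it is not attacker-controlled.
const tableName = "users"

// buildConstQuery concatenates only constants. A pattern-matching scanner
// may still flag the concatenation, although nothing here can be injected.
func buildConstQuery() string {
	return "SELECT name FROM " + tableName + " WHERE active = 1"
}

// buildTaintedQuery concatenates caller-supplied input into the query.
// If userInput originates from a request, this is a real injection risk.
func buildTaintedQuery(userInput string) string {
	return "SELECT name FROM users WHERE name = '" + userInput + "'"
}

func main() {
	fmt.Println(buildConstQuery())
	fmt.Println(buildTaintedQuery("x' OR '1'='1"))
}
```

Both functions concatenate strings into a query, so they look alike at the call site; only the origin of the concatenated value tells them apart.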

Why We Built GoKart

Static analysis is a powerful technique for finding vulnerabilities in source code. However, the approach has suffered from being noisy – that is, many static analysis tools find quite a few “vulnerabilities” that are not actually real. This has led to developer friction as users get tired of the tools “crying wolf” one time too many.

The motivation for GoKart was to address this: could we create a scanner with significantly lower false positive rates than existing tools? Based on our experimentation the answer is yes. By leveraging source-to-sink tracing and SSA, GoKart is capable of tracking variable taint between variable assignments, significantly improving the accuracy of findings. Our focus is on usability: pragmatically, that means we have optimized our approaches to reduce false alarms.

For more information, please read our blog post.

Introducing GoKart, A Smarter Go Security Scanner

At Praetorian, we’re committed to promoting and contributing to open source security projects and radically focused on developing technologies to enhance the overall state of cybersecurity. We love when our passions and business commitments overlap so today we’re stoked to announce the initial release of GoKart – a smarter security scanner for Go.

GoKart is our first foray into our new open source security strategy where we aim to seed the community with tools containing a set of baseline capabilities in the hope that it will spur further progression. Rather than attempting to craft rules for specific security concerns, we’ve focused on the release of several high-level analyzers using the Go analysis package which provide capabilities we’ve found missing from existing open source projects. Our goal is to engage and excite the community with this first release and to add features based on direct user feedback. The vision is to become the manufacturing and maintenance organization for the GoKart engine – allowing others to focus on fine tuning and building the cart while driving a higher performance machine.

Static analysis tools are a key part of a modern development pipeline and are used in various forms throughout the development lifecycle. In IDEs, syntax checkers catch errors before you even click the Compile button. Behind the scenes, they determine whether source code has a valid form and structure, resolve type information, and perform optimization during compilation. Even code autocompletion methods are based upon simple static analysis that helps prompt the programmer for what goes next. All these tools are great… but where things get exciting, at least for us, is when we apply these approaches to source code for the purposes of identifying security vulnerabilities. Done right, static application security testing (SAST) has the potential to reduce costs at the same time as improving security and productivity. That’s a pretty good outcome.

Compared to dynamic analysis, which actually runs a program, requiring code to be complete and in a fully buildable state, static testing is much more suitable to perform early and often within the development process. Since static analysis only considers the source code, there is no need to set up a custom testing environment, alleviating the need for costly and complex replication and sandboxing of a production web server, firewall, microservices, etc. By its very nature, static analysis provides visibility and analysis coverage of any source file contained in a local development build. Although more advanced static analysis techniques require the creation of custom rules and configurations which are complex to set up and use effectively, the truth that often gets lost in the noise is that static application security testing (SAST) can also be very fast and easy to use, and is essential to delivering high quality code and enforcing consistent secure coding standards.

In a security context, SAST can be instrumental in detecting common insecure programming patterns early and enforcing secure coding standards throughout the development lifecycle. SAST has the benefits of being scalable and fast, allowing it to be integrated into a CI/CD pipeline. Praetorian offers its own CI/CD security platform, Chariot, which can apply SAST to every commit. As GoKart evolves it will be included as one of Chariot’s available scanners to add context for developers, allowing them not only to find issues quickly but also to get the information they need to resolve them. Better still, this service is provided free and is foundational to our comprehensive view into the security posture of infrastructure and code across an enterprise.

Over the past decade, static analysis techniques have evolved from their humble origins. Whereas early linting services may have just applied simple RegEx rules to code, more modern approaches leverage data flow analysis, where user controllable data is tracked from user input through a call graph representation of the application and propagated to all functions known to be susceptible to a particular type of exploit. At the bleeding edge of research, static analysis techniques developed in the academic realm have shifted to the use of symbolic execution, model checking, constraint analysis and formal methods to create much more powerful capabilities for modeling and evaluation of source code. Meanwhile back in the real security world, commercial tools have struggled to really leverage these more complex and computationally intensive techniques, generally opting for increased language breadth over analysis depth. Worse yet, for the rest of the development world which relies on scouring GitHub to find and customize our security tools, we’ve discovered that even the data flow techniques pioneered 20 years ago haven’t yet broken out of their corporate cages. Instead, the majority of open source SAST tools have reverted to a grep-like pattern matching strategy, either on the source code directly or on an Abstract Syntax Tree (AST) representation of the program.

Why Did We Make GoKart?

At Praetorian, we eat our own dog food. Given our history of using Go for our offensive tooling development and our recent shift from Java Spring based micro-services to a more efficient, flexible and secure, fully containerized, Kubernetes based architecture using Go to streamline the Chariot platform, we felt strongly about improving the current state of automated Go security analysis.

Over the past decade, commercial SAST tools have gained a reputation for being overly complex to use, noisy and inaccurate, and costly to acquire and maintain. Their compiler based analysis engines, which worked well for statically typed languages like C++, Java and C#, have struggled to adapt their techniques to dynamically typed languages like JavaScript or Python and have been slow to embrace the cloud native paradigms of Docker, Kubernetes and Go. Open source security scanners, on the other hand, have been created in swarms but are typically not sophisticated enough to prove that a given finding was really a security threat or work reliably and accurately enough to be trusted to be run in an automated, unaided manner.

The most notable challenge to the adoption of all the tools is a high false positive rate and a lack of proof showing exactly why a flagged item is vulnerable. For example, many security scanners will simply report that a particular line of code has a security problem without showing the path to exploitation that an attacker would take. Other tools have evolved much more complex capabilities but require both security acumen and query language programming expertise. In practice, these shortcomings have contributed to false positive fatigue and a mentality of needing to wrestle with the scanner until the warnings go away. GoKart aims to address these issues by providing a user friendly, more accurate and less noisy experience and helping developers discover and understand full attack paths for high impact issues quickly and confidently.

In creating GoKart, we were inspired by gosec, currently the most widely used Go security scanner. We were impressed by its ease of use but wanted to see if we could improve upon its current results. Gosec contains thirty rules that apply pattern matching to an abstract syntax tree (AST) representation of Go code. Using the language’s AST helps gosec know exactly where each expression, constant, and function is in relation to other language constructs and prevents any sort of “misread” of code structure. On the security front, gosec handles a variety of issues from SQL injection and decompression bombs to short cryptographic key lengths and outdated TLS settings. The main analysis capability which gosec currently lacks is the ability to perform ‘taint tracking’, that is, determining code paths where user controllable data could potentially reach a vulnerable function. Adding taint tracking would allow rules to be written in a way that greatly reduces the false positive results associated with more simplistic signature, text or AST-pattern matching. Additionally, gosec may not reveal a potential attack path if the problem isn’t contained exactly where it is expected, producing a false sense of security for users and provoking further mistrust among security professionals. For instance, an adversary might have control over a string which is later used to construct a query leading to SQL injection far earlier in the program execution than when the SQL query gets executed; similarly, a constant value used as a size parameter for creation of an RSA key could be initialized by one function and modified by a second before being used by a third. In each case, the attack path or security flaw might originate in a different function or even a different file, and without taint tracking or data flow analysis these conditions will likely go undetected.

Picking up where gosec leaves off, GoKart first identifies potentially vulnerable functions in source code, and then traces the input of those functions back to their source. If the input source may be controlled by a user (such as in SQL injection), or if the input source is otherwise defined as “vulnerable” (like a short key length in an RSA key generator), GoKart will output the vulnerability.

How Does GoKart Work?

When designing GoKart our focus was to provide visibility into high impact findings in Go which provide significant value to our security engineers performing code reviews in the field as well as our own developers building new tools. We added capabilities to perform a lightweight version of taint propagation and analyzers utilizing these for several of the most interesting and prevalent vulnerabilities we find when performing manual code reviews on applications developed in Go: Command Injection, Path Traversal and Server Side Request Forgery. We also added the ability to customize GoKart with new Sinks, for creating additional vulnerability types, as well as Sources of user input tailored to a specific enterprise threat model. Based on our limited testing, we believe we’ve struck the correct balance between tool flexibility and usability, providing a delightful out-of-the-box experience with little to no configuration needed for surfacing high impact vulnerabilities with a significantly better signal-to-noise ratio.

GoKart uses the Go analysis package to build a call graph and puts Go code into static single assignment (SSA) form, structuring every value computed by the program as an assignment to a unique variable. SSA is used in compilers for optimization, and in a security context it can help us trace back the source of data used as input. Doing analysis in SSA form has a few benefits over simply using an AST. SSA form is better for looking at data flow, since every value is assigned exactly once. Being able to follow data as it flows through a program, weaving in and out of objects and modules, is one of GoKart’s primary features, and it is what makes GoKart so powerful. It can trace into all included packages and modules. Traversing the call graph in SSA form also simplifies code structure and only requires us to handle SSA primitives instead of all Go types.
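To make the "assigned exactly once" property concrete, here is a toy renaming pass over straight-line assignments. It is a sketch for illustration only: the Assign type and toSSA function are hypothetical, it handles no branches (so no phi nodes), and it is not how GoKart or the Go SSA package represents programs.

```go
package main

import (
	"fmt"
	"strings"
)

// Assign is one straight-line statement: Dst is computed from Srcs.
type Assign struct {
	Dst  string
	Srcs []string
}

// toSSA renames variables so that every name is assigned exactly once.
// Names that are never assigned (parameters, literals) are left unchanged.
func toSSA(prog []Assign) []Assign {
	version := map[string]int{}
	current := func(name string) string {
		v, ok := version[name]
		if !ok {
			return name
		}
		return fmt.Sprintf("%s_%d", name, v)
	}
	out := make([]Assign, 0, len(prog))
	for _, a := range prog {
		// Operands refer to the latest version defined so far.
		srcs := make([]string, len(a.Srcs))
		for i, s := range a.Srcs {
			srcs[i] = current(s)
		}
		// Each assignment creates a fresh version of the destination.
		version[a.Dst]++
		out = append(out, Assign{Dst: current(a.Dst), Srcs: srcs})
	}
	return out
}

func main() {
	prog := []Assign{
		{Dst: "q", Srcs: []string{`"SELECT * FROM t WHERE id="`}},
		{Dst: "q", Srcs: []string{"q", "input"}},
		{Dst: "r", Srcs: []string{"q"}},
	}
	for _, a := range toSSA(prog) {
		fmt.Println(a.Dst, "=", strings.Join(a.Srcs, " + "))
	}
}
```

After renaming, the value that reaches r has a single definition (q_2), which in turn names its own operands, so walking backwards from a use to its origin becomes a simple lookup.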

SSA also has the benefit of making constant propagation possible during analysis. Some misconfiguration and design vulnerabilities are only applicable if a certain parameter is used, such as creating an RSA key whose length is too short, making the key crackable. Static analysis traditionally would not be able to evaluate expressions that are not literals, but with constant propagation, constant properties and parameters can be evaluated without running the code. Thus, if you passed a variable instead of a literal into `rsa.GenerateKey`, security scanners couldn’t be sure if there was really an issue. Now, given that the variable is a constant or a constant whose manipulations are calculable at compile time, GoKart can determine what that RSA key length is. GoKart is thus able to accommodate different programming styles and is not limited to certain expectations about how code is written, such as expecting a literal argument to a function.
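As a rough illustration of evaluating a non-literal argument without running the code, the sketch below uses the standard library's go/types constant evaluation as a stand-in; GoKart's own SSA-based implementation may differ. The analyzed snippet, the stub generateKey (used in place of rsa.GenerateKey so that no importer is needed), and the 2048-bit threshold are assumptions made for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/constant"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a hypothetical program under analysis. The key size is passed as a
// named constant derived from another constant, not as a literal.
const src = `package demo

const baseBits = 512
const keyBits = baseBits * 2

func generateKey(bits int) {}

func setup() { generateKey(keyBits) }
`

// keySizes returns the constant value of the first argument of every call to
// generateKey that the type checker can evaluate at compile time.
func keySizes(source string) ([]int64, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{Types: map[ast.Expr]types.TypeAndValue{}}
	conf := types.Config{}
	if _, err := conf.Check("demo", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	var sizes []int64
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) == 0 {
			return true
		}
		if id, ok := call.Fun.(*ast.Ident); !ok || id.Name != "generateKey" {
			return true
		}
		// Value is non-nil only for constant expressions.
		if tv := info.Types[call.Args[0]]; tv.Value != nil {
			if v, exact := constant.Int64Val(tv.Value); exact {
				sizes = append(sizes, v)
			}
		}
		return true
	})
	return sizes, nil
}

func main() {
	sizes, err := keySizes(src)
	if err != nil {
		panic(err)
	}
	for _, bits := range sizes {
		// 2048 is an illustrative threshold, not a GoKart setting.
		fmt.Printf("key size %d bits, too short: %v\n", bits, bits < 2048)
	}
}
```

A scanner that only inspects literals would see the identifier keyBits and give up; resolving it to 1024 is what lets the short key be reported.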

GoKart contains a customizable list of input sources and vulnerable sinks, and since it does taint tracking, it can show exactly where in code a vulnerable input source is being fed into the application. Taint tracking not only greatly reduces the false positive rate of static analysis but also makes remediation much easier using the data path GoKart produces.
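The idea of tracing from a sink back to a source can be sketched as a backward walk over a definition map. This is a simplified model under stated assumptions: the defs type, the trace function, and the value names are hypothetical and are not GoKart's data structures or configuration format.

```go
package main

import "fmt"

// defs maps each value to the values it was computed from.
type defs map[string][]string

// trace walks backwards from value until it reaches a known source.
// It returns the path from the starting value to the source, or nil if
// no source is reachable. seen guards against cycles.
func trace(d defs, sources map[string]bool, value string, seen map[string]bool) []string {
	if seen[value] {
		return nil
	}
	seen[value] = true
	if sources[value] {
		return []string{value}
	}
	for _, in := range d[value] {
		if p := trace(d, sources, in, seen); p != nil {
			return append([]string{value}, p...)
		}
	}
	return nil
}

func main() {
	// Hypothetical program facts: cmdArg derives from request data,
	// safeArg derives from a string literal.
	d := defs{
		"cmdArg":    {"userInput"},
		"userInput": {"request.query"},
		"safeArg":   {"literal"},
	}
	sources := map[string]bool{"request.query": true}

	fmt.Println(trace(d, sources, "cmdArg", map[string]bool{}))
	fmt.Println(trace(d, sources, "safeArg", map[string]bool{}))
}
```

The returned path doubles as the evidence shown to the developer: each element is one step between the user-controllable source and the argument that reaches the sink.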

Despite making some advancements in using SSA for constant and taint propagation, our AST-based call graph implementation has many of its own limitations. Without proper control flow graph (CFG) construction, our taint analysis won’t properly consider all paths a computer program will branch into, leading, for instance, to flow insensitivity within methods as well as cases in which nodes from two branches are incorrectly found in a single call path. There is also the need to perform a level of pointer analysis to more accurately model the concurrency of Go channels – which we are currently over-approximating by assuming data returned from a channel is tainted, leading to potential false positives. Global variables also provide a formidable challenge, since they break down SSA’s assumptions about potential state changes.

Results

For four experimental vulnerability types, GoKart is able to reduce both the false negative and false positive rates over other Go scanners. In particular, GoKart is more accurate than gosec because it operates using taint tracking, makes fewer assumptions about programming styles, and only alerts when a potential vulnerability actually comes from an input source that is considered user-controllable or having the potential to be malicious.

Moving from our experimental testbed to a sample vulnerable application showed that our intuitions on noise reduction and signal amplification hold true. Scanning the go-test-bench application developed by Contrast Security (https://github.com/Contrast-Security-OSS/go-test-bench) demonstrates a significant improvement in signal to noise ratio, with GoKart finding 8 true positives from our three most common vulnerability types: Path Traversal, Command Injection and Server Side Request Forgery (SSRF), each with supporting evidence in the form of traces from user-controllable input to the vulnerable function.

Trace shows Handler method receiving a pointer of type http.Request, assigning it to userInput, which is eventually used in a call to the vulnerable function ioutil.WriteFile()

Trace shows function osExecHandler receiving pointer of type http.Request, assigning this to userInput which then is directly used in a call to vulnerable method exec.Command()

Trace shows function httpHandler receiving a pointer of type http.Request, assigning this to userInput which is then used to create a URL used in a call to vulnerable method http.Get().

At first blush, the overall results from gosec running with only the equivalent checks are quite similar (7 total results; no check exists for SSRF) but drilling a bit further into a specific Command Injection vulnerability identified by gosec but missing from GoKart demonstrates the value of properly tracking user input to a vulnerable function:

It seems reasonable to flag this as a vulnerability based only on the call site, since this *would* be a vulnerability if the userInput variable came from an externally controllable source (e.g. http.Request).

However, tracing through the code shows that userInput is a local variable created within the function directly before it is used, with no potential for malicious input to reach the vulnerable function. This classification is therefore a False Positive, one which requires some level of security expertise to identify and which exists even in such a small and intentionally vulnerable application.

Moving from the test track and driving GoKart in the real world gives us a sense of how it will perform on large enterprise codebases. We’ve started scanning some of our favorite Go applications and have found the results to be quite inspiring from a usability standpoint.

Running on grpc-go (https://github.com/grpc/grpc-go), GoKart reports only 2 Path Traversal findings, which both seem reasonable, except that they are found in the benchmark test and thus not something we would report to a customer. The fact that the entirety of the scan with results can be shown in a single page screenshot gives us that warm fuzzy feeling that we’re on the right path here:

We’re just now racing the GoKart back and forth around GitHub and have found the results as well as the overall driving experience to be something worth sharing with the world. We plan to take a deeper dive into some of the real world findings and share those in the near future but for those who are interested in a preview, here are some race results (project details have been redacted to practice our policy of responsible disclosure):

Trace on real world project 1 ran for 8 seconds, scanned 3,830 files, and identified 16 potential vulnerabilities
Trace on real world project 2 ran for 1 minute 5 seconds, scanned 7,307 files, and identified 21 potential vulnerabilities
Trace on real world project 3 ran for 4 seconds, scanned 1,570 files, and identified 13 potential vulnerabilities

Install

You can install GoKart locally by using any one of the options listed below.

Install with go install

$ go install github.com/praetorian-inc/gokart@latest

Install a release binary

  • Download the binary for your OS from the releases page.
  • (OPTIONAL) Download the checksums.txt file to verify the integrity of the archive

# Check the checksum of the downloaded archive
$ shasum -a 256 gokart_${VERSION}_${ARCH}.tar.gz
b05c4d7895be260aa16336f29249c50b84897dab90e1221c9e96af9233751f22  gokart_${VERSION}_${ARCH}.tar.gz
$ cat gokart_${VERSION}_${ARCH}_checksums.txt | grep gokart_${VERSION}_${ARCH}.tar.gz
b05c4d7895be260aa16336f29249c50b84897dab90e1221c9e96af9233751f22  gokart_${VERSION}_${ARCH}.tar.gz

  • Extract the downloaded archive

$ tar -xvf gokart_${VERSION}_${ARCH}.tar.gz

  • Move the gokart binary into your path:

$ mv ./gokart /usr/local/bin/

Clone and build yourself

# Clone the GoKart repo
$ git clone https://github.com/praetorian-inc/gokart.git
# Navigate into the repo directory and build
$ cd gokart
$ go build
# Move the gokart binary into your path
$ mv ./gokart /usr/local/bin
