Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6635487
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 25, 20262026-05-25T23:03:39+00:00 2026-05-25T23:03:39+00:00

I just read about the LLVM project and that it could be used to

  • 0

I just read about the LLVM project and that it could be used to do static analysis on C/C++ codes using the analyzer Clang which the front end of LLVM. I wanted to know if it is possible to extract all the accesses to memory(variables, local as well as global) in the source code using LLVM.

Is there any inbuilt library present in LLVM which I could use to extract this information.
If not please suggest me how to write functions to do the same.(existing source code, reference, tutorial, example…)
Of what i have thought, is I would first convert the source code into LLVM bc and then instrument it to do the analysis, but don’t know exactly how to do it.


I tried to figure out myself which IR should I use for my purpose ( Clang’s Abstract Syntax Tree (AST) or LLVM’s SSA Intermediate Representation (IR). ), but couldn’t really figure out which one to use.
Here is what I m trying to do.
Given any C/C++ program (like the one given below), I am trying to insert calls to some function, before and after every instruction that reads/writes to/from memory. For example consider the below C++ program ( Account.cpp)

#include <stdio.h>

class Account {
  int balance;

public:
  Account(int b) {
    balance = b;
  }

  int read() {
    int r;
    r = balance;
    return r;
  }

  void deposit(int n) {
    balance = balance + n;
  }

  void withdraw(int n) {
    int r = read();
    balance = r - n;
  }
};

int main () {
  Account* a = new Account(10);
  a->deposit(1);
  a->withdraw(2);
  delete a;
}

So after the instrumentation my program should look like:

#include <stdio.h>

class Account {
  int balance;

public:
  Account(int b) {
    balance = b;
  }

  int read() {
    int r;
    foo();
    r = balance;
    foo();
    return r;
  }

  void deposit(int n) {
    foo();
    balance = balance + n;
    foo();
  }

  void withdraw(int n) {
    foo();
    int r = read();
    foo();
    foo();
    balance = r - n;
    foo();
  }
};

int main () {
  Account* a = new Account(10);
  a->deposit(1);
  a->withdraw(2);
  delete a;
}

where foo() may be any function like get the current system time or increment a counter .. so on. I understand that to insert function like above I will have to first get the IR and then run an instrumentation pass on the IR which will insert such calls into the IR, but I don’t really know how to achieve it. Please suggest me with examples how to go about it.

Also I understand that once I compile the program into the IR, it would be really difficult to get 1:1 mapping between my original program and the instrumented IR. So, is it possible to reflect the changes made in the IR ( because of instrumentation ) into the original program.

In order to get started with LLVM pass and how to make one on my own, I looked at an example of a pass that adds run-time checks to LLVM IR loads and stores, the SAFECode’s load/store instrumentation pass (http://llvm.org/viewvc/llvm-project/safecode/trunk/include/safecode/LoadStoreChecks.h?view=markup and http://llvm.org/viewvc/llvm-project/safecode/trunk/lib/InsertPoolChecks/LoadStoreChecks.cpp?view=markup). But I couldn’t figure out how to run this pass. Please give me steps how to run this pass on some program say the above Account.cpp.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-25T23:03:40+00:00Added an answer on May 25, 2026 at 11:03 pm

    First off, you have to decide whether you want to work with clang or LLVM. They both operate on very different data structures which have advantages and disadvantages.

    From your sparse description of your problem, I’ll recommend going for optimization passes in LLVM. Working with the IR will make it much easier to sanitize, analyze and inject code because that’s what it was designed to do. The downside is that your project will be dependent on LLVM which may or may not be a problem for you. You could output the result using the C backend but that won’t be usable by a human.

    Another important downside when working with optimization passes is that you also lose all symbols from the original source code. Even if the Value class (more on that later) has a getName method, you should never rely on it to contain anything meaningful. It’s meant to help you debug your passes and nothing else.

    You will also have to have a basic understanding of compilers. For example, it’s a bit of a requirement to know about basic blocks and static single assignment form. Fortunately they’re not very difficult concepts to learn or understand (the Wikipedia articles should be adequate).

    Before you can start coding, you first have to do some reading so here’s a few links to get you started:

    • Architecture Overview: A quick architectural overview of LLVM. Will give you a good idea of what you’re working with and whether LLVM is the right tool for you.

    • Documentation Head: Where you can find all the links below and more. Refer to this if I missed anything.

    • LLVM’s IR reference: This is the full description of the LLVM IR which is what you’ll be manipulating. The language is relatively simple so there isn’t too much to learn.

    • Programmer’s manual: A quick overview of basic stuff you’ll need to know when working with LLVM.

    • Writing Passes: Everything you need to know to write transformation or analysis passes.

    • LLVM Passes: A comprehensive list of all the passes provided by LLVM that you can and should use. These can really help clean up the code and make it easier to analyze. For example, when working with loops, the lcssa, simplify-loop and indvar passes will save your life.

    • Value Inheritance Tree: This is the doxygen page for the Value class. The important bit here is the inheritance tree that you can follow to get the documentation for all the instructions defined in the IR reference page. Just ignore the ungodly monstrosity that they call the collaboration diagram.

    • Type Inheritance Tree: Same as above but for types.

    Once you understand all that then it’s cake. To find memory accesses? Search for store and load instructions. To instrument? Just create what you need using the proper subclass of the Value class and insert it before or after the store and load instruction. Because your question is a bit too broad, I can’t really help you more than this. (See correction below)

    By the way, I had to do something similar a few weeks ago. In about 2-3 weeks I was able to learn all I needed about LLVM, create an analysis pass to find memory accesses (and more) within a loop and instrument them with a transformation pass I created. There was no fancy algorithms involved (except the ones provided by LLVM) and everything was pretty straightforward. Moral of the story is that LLVM is easy to learn and work with.


    Correction: I made an error when I said that all you have to do is search for load and store instructions.

    The load and store instruction will only give accesses that are made to the heap using pointers. In order to get all memory accesses you also have to look at the values which can represent a memory location on the stack. Whether the value is written to the stack or stored in a register is determined during the register allocation phase which occurs in an optimization pass of the backend. Meaning that it’s platform dependent and shouldn’t be relied on.

    Now unless you provide more information about what kind of memory accesses you’re looking for, and in what context and how you intend to instrument them, I can’t help you much more than this.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I recall I have read about a parser which you just have to feed
I just got Delphi 2009 and have previously read some articles about modifications that
I would swear on my cat's grave that I just read about such a
I just read about zip bombs , i.e. zip files that contain very large
Just read some posts about how wonderful Silverlight 3.0 is, including that it uses
I just read about mysql's SQL_CALC_FOUND_ROWS and FOUND_ROWS() which helps you get the total
I just read on Wikipedia about elementary abelian groups which appear to be related
I've just read about MVC3's new CompareAttribute that you can apply to a model's
I just read some basics about Molehill, and I understood that learning an API
I'm currently using Backbone.js for my webapp but I just read about batman.js and

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.