The Archive Base

Editorial Team
Asked: May 26, 2026

I have tried the mathematica code for making the chaos game for DNA sequences

I have tried the mathematica code for making the chaos game for DNA sequences posted in this address:
http://facstaff.unca.edu/mcmcclur/blog/GeneCGR.html

which is like this:

genome = Import["c:\\data\\sequence.fasta", "Sequence"];
genome = StringReplace[ToString[genome], {"{" -> "", "}" -> ""}];
chars = StringCases[genome, "G" | "C" | "T" | "A"];
f[x_, "A"] := x/2;
f[x_, "T"] := x/2 + {1/2, 0};
f[x_, "G"] := x/2 + {1/2, 1/2};
f[x_, "C"] := x/2 + {0, 1/2};
pts = FoldList[f, {0.5, 0.5}, chars];
Graphics[{PointSize[Tiny], Point[pts]}]

The FASTA file I have is just a sequence of letters like AACCTTTGATCAAA, and the generated graph looks like this:

[image: chaos game plot of the sequence]

The code works fine with small sequences, but when I use a huge sequence, for example almost 40 Mb of a chromosome, the program takes a long time and only displays a black square, which is impossible to analyze.
Is it possible to improve the code above so that the square it displays is bigger? By the way, the plot must cover only the unit square.
Thanks in advance for your help.

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 4:42 pm

    Summary of the incremental edits below:

    Using compiled code gives a considerable speedup when computing the point coordinates (about 50x, not counting the time needed to compute the shifts):

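    (* Map each base to its shift vector; reals throughout to match the _Real argument *)
    shifts = chars /. {"A" -> {0., 0.}, "T" -> {.5, 0.}, "G" -> {.5, .5}, "C" -> {0., .5}};
    (* Compiled 1D iteration x -> x/2 + shift, applied to each coordinate separately *)
    fun1d = Compile[{{a, _Real, 1}}, FoldList[#/2 + #2 &, .5, a], CompilationTarget -> "C"];
    pts = Transpose[fun1d /@ Transpose[shifts]];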
    

    The real bottleneck in your code is rendering the graphic, so instead of plotting each point we’ll visualize the density of points:

    threshold = 1;
    With[{size = 300}, 
     Image[1 - UnitStep[BinCounts[pts, 1/size, 1/size] - threshold]]
    ]
    

    A region is coloured black if it contains at least threshold points; size is the image dimension in pixels. Choosing either a larger size or a larger threshold avoids the “black square problem”.
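
    One detail to watch, sketched here under the assumption that pts holds the chaos game points: BinCounts indexes its result as [x bin, y bin], while Image draws the first row at the top, so the picture does not have the same orientation as the Graphics plot. A minimal variant that pins the bins to the unit square and puts x on the horizontal axis with y pointing up might look like this (densityImage is a made-up helper name, not part of the answer above):

    ```mathematica
    (* Hypothetical helper: thresholded density image of points in the unit square *)
    densityImage[pts_, size_Integer, threshold_ : 1] :=
     Module[{counts},
      (* counts[[i, j]] is the number of points in x bin i and y bin j *)
      counts = BinCounts[pts, {0, 1, 1/size}, {0, 1, 1/size}];
      (* Black where a cell holds at least threshold points;
         Transpose and Reverse put x on the horizontal axis with y pointing up *)
      Image[1 - UnitStep[Reverse[Transpose[counts]] - threshold]]]

    (* Example with synthetic points; replace with the real pts *)
    densityImage[RandomReal[1, {10000, 2}], 300, 2]
    ```

    The explicit {0, 1, 1/size} bin specification also guarantees a size x size image even when the points do not reach every edge of the square.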


    My original answer with more details:

    On my rather dated machine, the code is not very slow.

    chars = RandomChoice[{"A", "T", "C", "G"}, 800000];
    
    f[x_, "A"] := x/2;
    f[x_, "T"] := x/2 + {1/2, 0};
    f[x_, "G"] := x/2 + {1/2, 1/2};
    f[x_, "C"] := x/2 + {0, 1/2};
    Timing[pts = FoldList[f, {0.5, 0.5}, chars];]
    Graphics[{PointSize[Tiny], Point[pts]}]
    

    I get a timing of 6.8 seconds, which is usable unless you need to run it lots of times in a loop (if it’s not fast enough for your use case and machine, please add a comment, and we’ll try to speed it up).

    Rendering the graphic unfortunately takes much longer than this (36 seconds), and I don’t know if there’s anything you can do about it. Disabling antialiasing with Style[Graphics[{PointSize[Tiny], Point[pts]}], Antialiasing -> False] may help a little depending on your platform, but not much (for me it makes no difference). This is a long-standing annoyance for many of us.

    Regarding the whole graphic being black: you can resize it with the mouse to make it bigger, and the next time you evaluate the expression the output graphic will remember its size. Or just use ImageSize -> 800 as a Graphics option. Given the limited pixel density of screens, the only other solution I can think of (that doesn’t involve resizing the graphic) is to plot the density of points using shades of grey.
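
    As a small illustration of the ImageSize suggestion, the following sketch also fixes the plot range to the unit square, which the question asks for. The random points are only a stand-in for the real pts, and the Frame option is an optional extra rather than something the original code uses:

    ```mathematica
    (* Synthetic stand-in for the real point list *)
    pts = RandomReal[1, {50000, 2}];

    (* Larger output, restricted to the unit square *)
    g = Graphics[{PointSize[Tiny], Point[pts]},
      PlotRange -> {{0, 1}, {0, 1}},
      ImageSize -> 800,
      Frame -> True]
    ```

    This only changes how large the points are drawn on screen. With tens of millions of points the rendering cost discussed above remains.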

    EDIT:

    This is how you can plot the density (this is also much much faster to compute and render than the point-plot!):

    With[{resolution = 0.01}, 
     ArrayPlot@BinCounts[pts, resolution, resolution]
    ]
    

    Play with the resolution to make the plot nice.

    For my random-sequence example, this only gives a grey plot. For your genome data it will probably give a more interesting pattern.
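
    For real genome data the counts per cell can differ by orders of magnitude, so a logarithmic grey scale may show more structure than raw counts. The following is a sketch under that assumption, again with random points standing in for pts. The 256 bins per axis are an arbitrary choice; a power of two lines the cells up with the quadrant structure of the chaos game, so each cell collects the points that share the same last 8 letters.

    ```mathematica
    (* Synthetic stand-in for the real point list *)
    pts = RandomReal[1, {200000, 2}];

    plot = Module[{n = 256, counts},
      (* counts[[i, j]] is the number of points in x bin i and y bin j *)
      counts = BinCounts[pts, {0, 1, 1/n}, {0, 1, 1/n}];
      (* Log[1 + c] keeps empty cells at 0 and compresses large counts;
         Transpose with DataReversed puts x horizontal and y pointing up *)
      ArrayPlot[Log[1. + Transpose[counts]], DataReversed -> True,
       ImageSize -> 512]]
    ```

    ArrayPlot draws larger values darker by default, so dense regions appear dark and empty regions white.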

    EDIT 2:

    Here’s a simple way to speed up the function using compilation:

    First, replace the characters by the shift vectors (has to be done only once for a dataset, then you can save the result):

    arr = chars /. {"A" -> {0., 0.}, "T" -> {.5, 0.}, "G" -> {.5, .5}, "C" -> {0., .5}};
    

    Then let’s compile our function:

    fun = Compile[{{a, _Real, 2}}, FoldList[#/2 + #2 &, {.5, .5}, a], 
     CompilationTarget -> "C"]
    

    Remove CompilationTarget if your version of Mathematica is earlier than 8 or you don’t have a C compiler installed.

    fun[arr]; // Timing
    

    gives me 0.6 seconds, which is an instant 10x speedup.
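
    Before relying on the compiled version, it may be worth checking on a short sequence that it reproduces the original points. This is a sketch of such a check; the names ref and maxDiff are introduced here for illustration, and the C target is left out so that no compiler is needed:

    ```mathematica
    (* Small random test sequence; stands in for the real chars *)
    chars = RandomChoice[{"A", "T", "C", "G"}, 1000];

    (* Reference: the uncompiled definitions from the question *)
    f[x_, "A"] := x/2;
    f[x_, "T"] := x/2 + {1/2, 0};
    f[x_, "G"] := x/2 + {1/2, 1/2};
    f[x_, "C"] := x/2 + {0, 1/2};
    ref = FoldList[f, {0.5, 0.5}, chars];

    (* Compiled version of the same iteration *)
    arr = chars /. {"A" -> {0., 0.}, "T" -> {.5, 0.}, "G" -> {.5, .5}, "C" -> {0., .5}};
    fun = Compile[{{a, _Real, 2}}, FoldList[#/2 + #2 &, {.5, .5}, a]];

    (* Largest coordinate difference between the two results *)
    maxDiff = Max[Abs[fun[arr] - ref]]
    ```

    If the two versions agree, maxDiff should be zero or at the level of floating-point rounding.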

    EDIT 3:

    Another ~5x speedup is possible compared to the above compiled version by avoiding some kernel callbacks in the compiled function (I checked the compilation output using CompilePrint to come up with this version — otherwise it’s not obvious why it’s faster):

    fun1d = Compile[{{a, _Real, 1}}, FoldList[#/2 + #2 &, .5, a], 
      CompilationTarget -> "C"]
    
    arrt = Transpose[arr];
    Timing[result = fun1d /@ arrt;]
    pts = Transpose[result];
    

    This runs in 0.11 seconds on my machine. On a more modern machine it should finish in a few seconds even for a 40 MB dataset.

    I split off the transpositions into separate inputs because at this point the running time of fun1d starts to get comparable to the running time of Transpose.
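
    The CompilePrint check mentioned above can be reproduced roughly as follows. CompilePrint lives in the CompiledFunctionTools package and returns the compiled instructions as a string, in which calls back into the main evaluator show up as MainEvaluate. The helper name usesCallback is made up for this sketch, and the result may depend on the Mathematica version:

    ```mathematica
    Needs["CompiledFunctionTools`"]

    (* Same definitions as above, without the C target so no compiler is required *)
    fun = Compile[{{a, _Real, 2}}, FoldList[#/2 + #2 &, {.5, .5}, a]];
    fun1d = Compile[{{a, _Real, 1}}, FoldList[#/2 + #2 &, .5, a]];

    (* Hypothetical helper: True if the compiled code calls back into the kernel *)
    usesCallback[cf_] := ! StringFreeQ[CompilePrint[cf], "MainEvaluate"]

    callbacks = usesCallback /@ {fun, fun1d}
    ```

    A True entry means the corresponding function still evaluates part of its body in the main kernel, which is the kind of overhead the 1D version is meant to avoid.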
