• Posted by Konstantin 13.10.2012

    I have recently discovered that simple Venn diagrams are surprisingly popular in bioinformatics. They are so popular, in fact, that there are several bioinformatics research papers devoted solely to their use. And those are highly accessed papers, let me add! Yet, despite this wild popularity, tools that let you render a decent Venn diagram programmatically seem to be rather scarce.

    Vennerable plot

    If you google a bit, you will find a bunch of online tools of varying quality and ability (1, 2, 3, 4, 5, 6, 7, 8, 9,...), a Java-based tool, a Perl library, a couple of Python scripts (1, 2), some R libraries (1, 2, 3, 4, 5), and lots of forum discussions. Seems to be plenty, doesn't it? Well, it turns out that if you want your diagram to be area-weighted (i.e. the region areas should be roughly proportional to the corresponding set sizes), 4 of those 18 links won't do. If you want to generate and configure the diagram conveniently from a script, drop another 9. Then, if you want the diagram to look nice, drop 4 more, and all you are left with is the Vennerable R package. Unfortunately, Vennerable plots are still a pain to configure: even adding a plot title seems to be very tricky, not to mention highlighting and annotating a region on the diagram.

    Having been totally disappointed in the state of the art of contemporary Venn-diagramming tools, I made a small Python package for drawing Venn diagrams that has the necessary flexibility. At least it lets me put plot titles and annotate diagram regions as I fancy.

     

    Matplotlib-venn plot

     

    Package installation goes by the standard method: easy_install matplotlib-venn

    For basic usage examples, consult the PyPI page.

    Posted by Konstantin @ 2:05 am

  • 26 Comments

    1. Hjalmar on 20.10.2012 at 17:06 (Reply)

      First, thanks for writing this code. It is exactly what I need.
      Unfortunately, I cannot run it. It appears to interfere with matplotlib.

      [... long stacktrace truncated as not relevant any more ...]

      Do you have advice?

      best,
      Hjalmar

      1. Konstantin on 20.10.2012 at 17:21 (Reply)

        Eh, I was suspecting that this whole “namespace package” business won’t work as expected for at least someone (i.e. distributing a package matplotlib.venn from a separate egg).

        That means I need to rename the package to something non-conflicting (e.g. matplotlibvenn instead of matplotlib.venn or the like). I’ll fix it today/tomorrow. If you want to try things before that, just download the source, rename the matplotlib directory there to something different, edit the namespace_packages option in setup.py, and either run python setup.py install or just start python from that directory and do “import whateveryournewname.venn”, etc.

    2. Hjalmar on 20.10.2012 at 18:35 (Reply)

      I did a quick fix.
      Placed _venn2.py, _venn3.py and _math.py in my working dir and imported venn2 from _venn2.
      It worked fine.

      Thanks again for writing the code, it was very useful.

      1. Konstantin on 20.10.2012 at 23:56 (Reply)

        I fixed the problem now by renaming the package from matplotlib.venn to matplotlib_venn in the new version.

        Thanks for your feedback!

    3. Hjalmar on 21.10.2012 at 17:21 (Reply)

      It works fine using pip install.

      Thanks

    4. Eirinn on 28.11.2012 at 13:00 (Reply)

      Hi Konstantin,
      I really like your venn module – it’s exactly what I need and very easy to use. Thanks!

      There’s a bug in the colour-calculation code:

      venn2((3,2,1), set_colors=('blue', 'lightblue'))

      ValueError: to_rgba: Invalid rgba arg “[ 0.47490196 0.59294118 1.33137255]” number in rbg sequence outside 0-1 range

      Basically, it doesn’t handle colours that are too bright.

      1. Konstantin on 28.11.2012 at 16:29 (Reply)

        Wow, that was an unexpectedly stupid bug from my side.

        Fixed. Do
        pip install -U --no-deps matplotlib_venn
        (or "easy_install --upgrade matplotlib_venn")
        and try again.

        Thanks!
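The numbers in the error message above happen to equal 0.7 * (blue + lightblue), so the overflow can be reproduced outside the package. The sketch below does that and shows one possible guard, clipping the result to [0, 1]. The `mix_colors` helper and its `factor` argument are hypothetical names for this illustration, and clipping is an assumption, not necessarily the fix that was actually released.

```python
import numpy as np
from matplotlib.colors import to_rgb

def mix_colors(color_a, color_b, factor=0.7):
    """Blend two colours as factor * (a + b), clipped to the valid [0, 1] range."""
    raw = factor * (np.array(to_rgb(color_a)) + np.array(to_rgb(color_b)))
    return np.clip(raw, 0.0, 1.0)

# Without clipping, the blue channel overshoots 1, as in the reported ValueError
unclipped = 0.7 * (np.array(to_rgb("blue")) + np.array(to_rgb("lightblue")))

# With clipping, every channel is a valid colour component
mixed = mix_colors("blue", "lightblue")
```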

    5. Konrad on 19.05.2013 at 21:17 (Reply)

      Many thanks for this very helpful library!

    6. tryptofame on 27.05.2013 at 18:53 (Reply)

      Dear Konstantin,
      thank you for the package.
      I’ve successfully used it, but am unhappy with the result, since the numbers for subsets/“sub-circles” are not placed correctly within the graph. I’ve tried playing around with plt.figure(figsize=xyz) and the “normalize_to” variable, but the results stay the same.
      Do you have any suggestions or tips? I would appreciate your help!

      Thank you for your effort and time!
      Best regards,
      T

      1. Konstantin on 27.05.2013 at 19:37 (Reply)

        Hello. You are free to reposition the elements of the resulting plot to your liking by accessing them directly (as matplotlib objects). Check out the answer to this issue. Note that the picture generated by matplotlib-venn has a width of 1 (i.e. increasing the x coordinate of a label by 0.5 should move it right by half the plot width).

    7. setty on 09.07.2013 at 20:20 (Reply)

      Hi

      Thanks for the awesome pkg! Is there any easy way of switching off the area weighting? The Venn I am trying to make has differences of over 10^4 so smaller areas become difficult to see….

      Thanks again!
      S

      1. Konstantin on 11.07.2013 at 19:37 (Reply)

        Hi,

        You can remove the weighting by specifying equal values for all subset sizes. You’ll then have to change subset labels to their actual values manually. I.e. something like:

        diagram = venn3(subsets=(1,1,1,1,1,1,1))
        diagram.get_label_by_id('111').set_text(12)
        diagram.get_label_by_id('101').set_text(31)

        I remember that I did not include something like a “weighted=False” parameter on purpose (but now I don’t remember why). I’ll add it in the next version.

    8. Joe on 30.08.2013 at 15:12 (Reply)

      Hi,

      Is it possible to make a circle within a circle? As in, display a set A which fully encompasses set B (which then intersects with a set C, as in a normal diagram)? I can’t work out how to do this. Any help would be greatly appreciated.

      Thanks.

      1. Konstantin on 30.08.2013 at 15:31 (Reply)

        If your data indicate that a circle should be within a circle, that is what you should see, in general. E.g.

        venn3(subsets=(1, 0, 1, 1, 1, 0, 1))

        I.e. as the data here indicate, the areas of “010” and “011” are both zero. Consequently, circle “B” must lie completely inside “A”.

        Admittedly, this configuration seems to produce somewhat unexpected colors (because the regions overlap in a way I did not account for) and some labels are located suboptimally. I’ll try to fix it in the next release. So far you may overcome it by tuning the transparency and the colors of certain regions manually, e.g.:

        v = venn3(subsets=(1, 0, 1, 1, 1, 0, 1))
        v.get_patch_by_id('110').set_color('red')
        v.get_patch_by_id('110').set_alpha(1)

        Note: there does not necessarily exist a perfect circle positioning that will exactly correspond to all seven of your region areas. In particular, the current positioning algorithm only makes sure that the three pairwise circle intersections (AB, BC, AC) have correct areas. It should always reflect the “circle within other circle” situation correctly, though, I think.
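The "B inside A" reasoning in this reply can be checked from the seven region sizes alone, without drawing anything: B lies inside A exactly when |B| equals |A and B|. The sketch below does that bookkeeping in plain Python; `REGION_IDS` and `region_sum` are hypothetical names used only for this illustration.

```python
# Region ids in the order venn3 expects its seven subset sizes
REGION_IDS = ("100", "010", "110", "001", "101", "011", "111")

def region_sum(subsets, required):
    """Sum the sizes of all regions lying inside every set index in `required`
    (0 = A, 1 = B, 2 = C)."""
    total = 0
    for region_id, size in zip(REGION_IDS, subsets):
        if all(region_id[i] == "1" for i in required):
            total += size
    return total

subsets = (1, 0, 1, 1, 1, 0, 1)
size_b = region_sum(subsets, [1])       # |B|: regions 010, 110, 011, 111
size_ab = region_sum(subsets, [0, 1])   # |A and B|: regions 110, 111

# B is contained in A when no part of B falls outside A
b_inside_a = size_b == size_ab
```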

        1. Joe on 30.08.2013 at 16:33 (Reply)

          Hi, thanks for your response. I still can’t get my desired venn diagram working though. I’ve drawn (badly) an example of what I want: http://i.imgur.com/Vk32xyQ.png

          I’ve tried using the following numbers:

          venn3(subsets=(25, 231, 8, 0, 15, 80, 15))

          which don’t seem to work – am I missing anything?

          Thanks again.

          1. Konstantin on 30.08.2013 at 17:12 (Reply)

            Firstly, your parameters are wrong. Namely, the size of the set “101” must be 0 rather than 15, according to your image. So the correct call would be
            venn3(subsets=(25, 231, 8, 0, 0, 80, 15))

            Secondly, the current layout algorithm will not produce what you want. If you look at the result of the above call it is close to what you need except that one of the bubbles is “stuck” to the wall of another one. The layout algorithm is not smart enough to come up with an idea of aligning the bubbles for a slightly nicer look in this particular special case.

            If you really-really need the symmetrically aligned version the simplest way you could try is to draw a two-set diagram and superimpose a third circle on top, e.g.:

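            # Imports needed by the names used below
            import numpy as np
            from numpy import sqrt
            from matplotlib.pyplot import gca
            from matplotlib.patches import Circle
            from matplotlib_venn import venn2

            # Draw a two-circle diagram
            
            venn2(subsets=(25, 231+65, 8+15))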
            
            # Find out the location of the two circles
            # (you can look up how it's done in the first lines
            # of the venn2 function)
            
            from matplotlib_venn._venn2 import compute_venn2_areas, \
                                        solve_venn2_circles
            subsets = (25, 231+65, 8+15)
            areas = compute_venn2_areas(subsets, normalize_to=1.0)
            centers, radii = solve_venn2_circles(areas)
            
            # Now draw the third circle.
            # Its area is (15+65)/(25+8+15) times
            # that of the first circle,
            # hence its radius must be 
            
            r3 = radii[0]*sqrt((15+65.0)/(25+8+15))
            
            # Its position must be such that the intersection
            # area  with C1 is  15/(15+8+25) of C1's area.
            # The way to compute the distance between
            # the circles by area can be looked up in
            # solve_venn2_circles
            
            from matplotlib_venn._math import find_distance_by_area
            distance = find_distance_by_area(radii[0], r3,
                        15.0/(15+8+25)*np.pi*radii[0]*radii[0])
            ax = gca()
            ax.add_patch(Circle(centers[0] + np.array([distance, 0]),
                         r3, alpha=0.5, edgecolor=None,
                         facecolor='red', linestyle=None,
                         linewidth=0))
            

            .. and then you’ll need to play around with the text labels a bit too.
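For readers who want to see what "computing the distance between the circles by area" involves, here is a self-contained sketch. It is not the package's `find_distance_by_area`; it is a stand-in that uses the standard formula for the overlap area of two circles and plain bisection. The function names are hypothetical, and the radius of the first circle is set to 1 purely for the example.

```python
import math

def circle_intersection_area(r1, r2, d):
    """Overlap area of two circles with radii r1, r2 whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0                            # disjoint circles
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2     # one circle inside the other
    clamp = lambda x: max(-1.0, min(1.0, x))  # guard against rounding errors
    a1 = r1 * r1 * math.acos(clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = r2 * r2 * math.acos(clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    prod = (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    return a1 + a2 - 0.5 * math.sqrt(max(0.0, prod))

def distance_for_area(r1, r2, target_area, tol=1e-10):
    """Bisect on the centre distance until the overlap matches target_area."""
    lo, hi = abs(r1 - r2), r1 + r2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if circle_intersection_area(r1, r2, mid) > target_area:
            lo = mid   # overlap too large: move the circles apart
        else:
            hi = mid   # overlap too small: move them closer
    return 0.5 * (lo + hi)

# Same proportions as in the reply: the third circle has (15+65)/(25+8+15)
# times the area of the first, and overlaps 15/(15+8+25) of the first circle.
r1 = 1.0
r3 = r1 * math.sqrt((15 + 65.0) / (25 + 8 + 15))
target = 15.0 / (15 + 8 + 25) * math.pi * r1 * r1
distance = distance_for_area(r1, r3, target)
```

Bisection works here because the overlap area decreases monotonically as the centres move apart between full containment and tangency.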

            1. Joe on 30.08.2013 at 18:24 (Reply)

              Hi, thanks a lot for your help. I see now where I was going wrong with my numbers. I think I’d rather use the first method, but it looks like it might be a bit painful to get the labels in the right positions, remove the 0, etc. I’ll try to look at the wiki and sort out the labels, and hopefully I can get something close to my hand-drawn version.

              Thanks again.

            2. Konstantin on 30.08.2013 at 18:54 (Reply)

              Try this as a starting point:

              v = venn3(subsets=(25, 231, 8, 0, 0, 80, 15))
              v.get_label_by_id('001').set_text('')
              v.get_label_by_id('101').set_text('')
              p = v.get_label_by_id('100').get_position()
              v.get_label_by_id('100').set_position([p[0]-0.1, p[1]])
              p = v.get_label_by_id('011').get_position()
              v.get_label_by_id('011').set_position([p[0]+0.3,
                                       p[1]+0.15])
              p = v.get_label_by_id('110').get_position()
              v.get_label_by_id('110').set_position([p[0],
                                       p[1]+0.05])
              
            3. Joe on 02.09.2013 at 15:08 (Reply)

              Wow! Thank you so much for your help – I can’t tell you how much time and pain this has saved me. You have really renewed my faith in the open source/developer community. Brilliant library and brilliant dev.

    9. TimothyAWiseman on 01.10.2013 at 22:10 (Reply)

      This is a fantastic package. Thank you for providing it. I am having a hard time in finding a way to set the background color. I would expect this to work:

      import matplotlib_venn as venn
      from matplotlib import pyplot as plt

      fig = plt.figure()
      ax = fig.add_subplot(111, axisbg = 'green')
      v = venn.venn2((1, 2, 1), ax = ax)
      plt.show()

      But it does not give me a green background. I have tried a number of variations but have not been able to get it to work. Any suggestions?

      1. Konstantin on 01.10.2013 at 23:23 (Reply)

        Hey, thanks for the feedback.

        Try this:

        v = venn2( (10, 20, 10), alpha = 1 )
        gca().set_axis_bgcolor('blue')
        gca().set_axis_on()

        Explanation: you do not see the axis background because somewhere within the package I do something like ax.set_axis_off(). This removes the axis rectangle completely. It seemed like a good thing to do for an average Venn diagram. You can switch it back on, by calling set_axis_on() on the axis object. Also, note that by default the Venn diagrams use colors with transparency, which look ugly with a non-white background. You can disable the default transparency setting by giving the “alpha” keyword parameter to the venn* function. This works because the function passes all the keyword arguments it does not know about directly to the Patch objects that make up the diagram.

        1. TimothyAWiseman on 02.10.2013 at 00:17 (Reply)

          That works. Thanks.

          Now that I’ve played with it, I agree that I will normally want the axis off most of the time. But it’s nice to know why it wasn’t working.

          The funny thing is that I looked through some of your source code to try to figure it out, and I completely missed that in the prepare_venn3_axes until you pointed it out.

          Thanks again.

    10. Juan on 05.11.2013 at 20:26 (Reply)

      Is it possible to do more than 3 sets? What package would you recommend for n>3 sets?

      1. Konstantin on 08.11.2013 at 02:56 (Reply)

        This particular package won’t do more than 3 sets. If you want an unweighted diagram with four sets, check out this script.

        An area-weighted diagram for more than three sets won’t make too much illustrative sense in most cases I believe, because it is way too underconstrained (i.e. it is only possible to faithfully depict perhaps 10 or so areas out of the total 15 in the diagram).

        However, if you really want to, you should probably try VennMaster.

        (And yes, both of those links are mentioned in the body of the blog post).
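A rough way to see the mismatch is to count. The sketch below assumes the sets are drawn as circles, each described by a centre (x, y) and a radius, and discounts translation and rotation of the whole picture. It is a back-of-the-envelope count, not the author's own estimate, and `venn_counts` is a hypothetical name.

```python
def venn_counts(n_sets):
    """Return (number of regions, number of free layout parameters) for n circles."""
    regions = 2 ** n_sets - 1          # every non-empty combination of sets
    free_parameters = 3 * n_sets - 3   # (x, y, r) per circle, minus 2 for translation and 1 for rotation
    return regions, free_parameters

three_sets = venn_counts(3)
four_sets = venn_counts(4)
```

With three circles there are 7 region areas but only 6 free parameters, and with four circles 15 areas against 9, so the gap grows quickly.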

    11. Sébastien Brodeur on 13.02.2014 at 08:29 (Reply)

      A simple solution for title in Vennerable in R.

      v <- Venn(n=2)
      plot(v)
      grid.text("Title", vp = viewport(x=0.5, y=.9, w=unit(1, "npc"), h=unit(1, "npc")))

      1. Konstantin on 13.02.2014 at 17:43 (Reply)

        Nice! I wish this hint were in some place that is easy to google up. (Although this site does come up somewhere on the first page of a “vennerable plot title” query currently, I am not sure most people looking for this answer would find your remark here).

        Perhaps you could post it as a self-answered question on stackoverflow?
