
Packaging Python Scripts with Supplementary R Package



Hello,

I am working on a software package that is built primarily in Python but also uses an R package I have written to provide some necessary functionality. The R portion depends on two R packages that perform specialized tasks for which I have not found Python equivalents: MSIseq, which classifies tumors as microsatellite stable or unstable based on mutation data, and deconstructSigs, which determines the major mutation signatures present in tumor samples. I hope to package this software so that it is easily installed on other systems, as I know I appreciate such conveniences when working with new software. I understand that creating a package for the pip installer is standard practice for Python-based software, but as far as I can tell, pip only lets me declare other Python packages as dependencies, so I would not be able to specify my R package or r-base itself.
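For illustration, the dependency gap described above could be narrowed with a check at startup that reports what is missing. This is only a sketch: the function names are hypothetical, and it assumes that Rscript is on the PATH whenever r-base is installed and that a package counts as present if R's requireNamespace() can load it.

```python
import shutil
import subprocess

# The two R packages named in this message.
REQUIRED_R_PACKAGES = ["MSIseq", "deconstructSigs"]


def build_check_command(package, rscript="Rscript"):
    """Build a command that exits with status 0 only if the R package loads."""
    expr = (
        'quit(save = "no", status = as.integer('
        '!requireNamespace("{}", quietly = TRUE)))'.format(package)
    )
    return [rscript, "-e", expr]


def missing_r_dependencies(packages=REQUIRED_R_PACKAGES):
    """Return the names of missing dependencies ("r-base" if Rscript is absent)."""
    if shutil.which("Rscript") is None:
        return ["r-base"]
    return [
        p for p in packages
        if subprocess.run(build_check_command(p)).returncode != 0
    ]
```

A check like this does not install anything; it only lets the Python side fail early with a clear message instead of failing partway through an analysis.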

I have explored a couple of alternative options so far:
First, I tried using rpy2 to run the necessary R scripts from within Python, but I ran into problems installing the packages mentioned above. (I believe the issue was that one of them depends on rJava, which was not installing correctly through rpy2.) Currently, I am instead using Python's subprocess module to call the R scripts, passing arguments on the command line. This has worked just fine so far.
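The subprocess approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: the function names are hypothetical, and it assumes the R scripts are run through the Rscript executable that ships with R and print their results to standard output.

```python
import shutil
import subprocess


def build_rscript_command(script_path, args, rscript="Rscript"):
    """Build the argument list for running an R script with command-line arguments."""
    # --vanilla keeps saved workspaces and user profiles from affecting the run.
    return [rscript, "--vanilla", str(script_path)] + [str(a) for a in args]


def run_r_script(script_path, args):
    """Run an R script and return its standard output.

    Raises RuntimeError if Rscript is not on PATH, and
    subprocess.CalledProcessError if the script exits with a nonzero status.
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH; is r-base installed?")
    result = subprocess.run(
        build_rscript_command(script_path, args),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=True,           # raise if the R script fails
    )
    return result.stdout
```

Passing the command as a list rather than a single shell string avoids quoting problems when file paths contain spaces.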

The second option I have explored is to bundle both the Python package and the R package into a single Debian package so that they are distributed as a unit. So far, I have been successful in using stdeb to build and deploy the Python portion of my project. (I also had success using dh-python, although I eventually found that stdeb streamlined the build process significantly.) However, I am having trouble packaging the R portion of the project in a way that allows it to be installed directly from my PPA. (This may venture outside the scope of this mailing list, but my understanding is that my difficulties stem from the fact that MSIseq and deconstructSigs are not available as packages in the default Debian apt repositories.)

My goal for this project is to package everything so that a single command is all that is needed to install the software and get things running (e.g. "sudo apt install [my_package]"). Is this reasonable, considering the problems I am encountering and the admittedly dubious decision I've made to hybridize Python and R code in a single project? Are there any solutions I have missed, or documentation that you believe would be helpful? Furthermore, this is the first time I have attempted a project of this scale, so I am open to criticism of my distribution goals as well as the general architecture of the project itself. I have been developing this software in a vacuum for far too long and would really appreciate any assistance anyone is willing to give.

Thank you,
Ben Morledge-Hampton
