pandas - Allowing multiple inputs to Python subprocess


I have a near-identical problem to one asked several years ago: "Python subprocess with 2 inputs", which received one answer but no implementation. I'm hoping this repost may help clear things up for me and others.

As in the above, I would like to use subprocess to wrap a command-line tool that takes multiple inputs. In particular, I want to avoid writing the input files to disk and would rather use e.g. named pipes, as alluded to in the above. That should read "learn how to", as I admittedly have never tried using named pipes before. I'll further state that the inputs I have are two pandas dataframes, and I'd like to get one back as output.

The generic command-line invocation:

/usr/local/bin/my_command inputfilea.csv inputfileb.csv -o outputfile 

My current implementation, predictably, doesn't work. I don't see how or when the dataframes get sent to the command process through the named pipes, and I'd appreciate some help!

    import os
    import StringIO  # Python 2 module; Python 3 moved this to io.StringIO
    import subprocess
    import pandas as pd

    dfa = pd.DataFrame([[1,2,3],[3,4,5]], columns=["a","b","c"])
    dfb = pd.DataFrame([[5,6,7],[6,7,8]], columns=["a","b","c"])

    # make 2 FIFOs to host the dataframes
    fna = 'inputa'; os.mkfifo(fna); ffa = open(fna,"w")
    fnb = 'inputb'; os.mkfifo(fnb); ffb = open(fnb,"w")

    # don't know if I need to make 2 subprocesses to pipe the inputs
    ppa  = subprocess.Popen("echo",
                        stdin =subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
    ppb  = subprocess.Popen("echo",
                        stdin =subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

    ppa.communicate(input = dfa.to_csv(header=False,index=False,sep="\t"))
    ppb.communicate(input = dfb.to_csv(header=False,index=False,sep="\t"))

    pope = subprocess.Popen(["/usr/local/bin/my_command",
                            fna,fnb,"stdout"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    (out,err) = pope.communicate()

    try:
        out = pd.read_csv(StringIO.StringIO(out), header=None,sep="\t")
    except ValueError: # fail
        out = ""
        print("\n###command failed###\n")

    os.unlink(fna); os.remove(fna)
    os.unlink(fnb); os.remove(fnb)
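A likely reason the attempt above stalls at its first `open(fna, "w")`: on POSIX systems, opening a FIFO for writing normally blocks until something opens the same FIFO for reading, and at that point `my_command` has not been started yet. The following is a minimal sketch of that rendezvous using only the standard library. The names `fifo_round_trip` and `demo_fifo` are made up for illustration and are not part of the question; `os.mkfifo` is available on Unix only.

```python
import os
import tempfile
import threading


def fifo_round_trip(text):
    """Send text through a named pipe, writing from a helper thread."""
    dirname = tempfile.mkdtemp()
    path = os.path.join(dirname, "demo_fifo")  # hypothetical name
    os.mkfifo(path)  # Unix only
    try:
        def writer():
            # This open() blocks until the reader below opens the FIFO.
            with open(path, "w") as f:
                f.write(text)

        t = threading.Thread(target=writer)
        t.start()
        # Opening for reading releases the writer; read() returns at EOF,
        # that is, once the writer has closed its end.
        with open(path) as f:
            received = f.read()
        t.join()
        return received
    finally:
        os.remove(path)
        os.rmdir(dirname)


print(fifo_round_trip("1\t2\t3\n3\t4\t5\n"), end="")
```

If the writer's `open()` were called in the main thread before any reader existed, the script would wait there indefinitely, which matches the behaviour described in the question.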

You don't need additional processes to pass data to a child process without writing it to disk:

    #!/usr/bin/env python
    import os
    import shutil
    import subprocess
    import tempfile
    import threading
    from contextlib import contextmanager

    import pandas as pd

    @contextmanager
    def named_pipes(count):
        # create the FIFOs in a private temporary directory
        dirname = tempfile.mkdtemp()
        try:
            paths = []
            for i in range(count):
                paths.append(os.path.join(dirname, 'named_pipe' + str(i)))
                os.mkfifo(paths[-1])
            yield paths
        finally:
            shutil.rmtree(dirname)

    def write_command_input(df, path):
        # opening the FIFO blocks until the child opens it for reading
        df.to_csv(path, header=False, index=False, sep="\t")

    dfa = pd.DataFrame([[1,2,3],[3,4,5]], columns=["a","b","c"])
    dfb = pd.DataFrame([[5,6,7],[6,7,8]], columns=["a","b","c"])

    with named_pipes(2) as paths:
        p = subprocess.Popen(["cat"] + paths, stdout=subprocess.PIPE)
        with p.stdout:
            # one writer thread per input, so no write blocks the main thread
            for df, path in zip([dfa, dfb], paths):
                t = threading.Thread(target=write_command_input, args=[df, path])
                t.daemon = True
                t.start()
            result = pd.read_csv(p.stdout, header=None, sep="\t")
    p.wait()

cat is used for demonstration. You should use your command instead ("/usr/local/bin/my_command"). I assume that you can't pass the data using standard input and you have to pass the input via files. The result is read from the subprocess' standard output.
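The same idea can be wrapped in a reusable helper. The sketch below uses only the standard library and takes already-serialized text instead of dataframes; `run_with_fifo_inputs` and `_write_text` are hypothetical names, and `cat` again stands in for the real command. It assumes the command reads every input path to the end and writes its result to standard output; how `my_command` treats its `-o` option is not known here.

```python
import os
import shutil
import subprocess
import tempfile
import threading


def _write_text(path, text):
    # open() blocks until the child process opens the FIFO for reading.
    with open(path, "w") as f:
        f.write(text)


def run_with_fifo_inputs(command, payloads):
    """Run command with one FIFO path per payload appended; return stdout bytes."""
    dirname = tempfile.mkdtemp()
    try:
        paths = []
        for i in range(len(payloads)):
            paths.append(os.path.join(dirname, "named_pipe" + str(i)))
            os.mkfifo(paths[-1])  # Unix only
        p = subprocess.Popen(list(command) + paths, stdout=subprocess.PIPE)
        for path, text in zip(paths, payloads):
            t = threading.Thread(target=_write_text, args=(path, text))
            # Daemon threads do not keep the interpreter alive if the
            # child exits without ever opening one of the FIFOs.
            t.daemon = True
            t.start()
        out, _ = p.communicate()  # read stdout to EOF, then wait for exit
        if p.returncode != 0:
            raise subprocess.CalledProcessError(p.returncode, command)
        return out
    finally:
        shutil.rmtree(dirname)


output = run_with_fifo_inputs(["cat"], ["1\t2\t3\n3\t4\t5\n", "5\t6\t7\n6\t7\t8\n"])
print(output.decode(), end="")
```

With pandas, each payload could be produced by `df.to_csv(header=False, index=False, sep="\t")`, which returns a string when no path is given, and the returned bytes could be parsed with `pd.read_csv(io.BytesIO(output), header=None, sep="\t")`.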

