F# - Return multiple columns / a dataframe in Deedle based on row-wise mapping


I want to look at each row in a frame and construct multiple columns for a new frame based on the values in that row.

The final result should be a frame that has the columns of the original frame plus the new columns.

I have a solution, but I wonder if there is a better one. I think the best way to explain the desired behavior is with an example. I'm using Deedle's Titanic data set:

    #r @"f:\aolney\research_projects\braintrust\code\qualtricstor\packages\deedle.1.2.3\lib\net40\deedle.dll";;
    #r @"f:\aolney\research_projects\braintrust\code\qualtricstor\packages\fsharp.charting.0.90.12\lib\net40\fsharp.charting.dll";;
    #r @"f:\aolney\research_projects\braintrust\code\qualtricstor\packages\fsharp.data.2.2.2\lib\net40\fsharp.data.dll";;
    open System
    open FSharp.Data
    open Deedle
    open FSharp.Charting;;
    #load @"f:\aolney\research_projects\braintrust\code\qualtricstor\packages\fsharp.charting.0.90.12\fsharp.charting.fsx";;
    #load @"f:\aolney\research_projects\braintrust\code\qualtricstor\packages\deedle.1.2.3\deedle.fsx";;

    // Load the Titanic passenger data into a frame
    let titanic = Frame.ReadCsv(@"c:\users\aolne_000\downloads\titanic.csv");;

This frame looks like:

    val titanic : Frame<int,string> =

         PassengerId Survived Pclass Name                                                Sex    Age SibSp Parch Ticket    Fare    Cabin Embarked
    0 -> 1           False    3      Braund, Mr. Owen Harris                             male   22  1     0     A/5 21171 7.25          S
    1 -> 2           True     1      Cumings, Mrs. John Bradley (Florence Briggs Thayer) female 38  1     0     PC 17599  71.2833 C85   C

My approach grabs each row, applies some selection logic, and then returns a new row value as a dictionary. I then use Deedle's expansion operation to convert the values in this dictionary to new columns.

    titanic?test <-
        titanic
        |> Frame.mapRowValues (fun x ->
            // Choose the new values for this row based on its passenger class
            if x.GetAs<int>("Pclass") > 1 then dict ["a", 1; "b", 2]
            else dict ["a", 2; "b", 1]);;
    titanic |> Frame.expandCols ["test"];;

This gives the following new frame:

         PassengerId Survived Pclass Name                                                Sex    Age SibSp Parch Ticket    Fare    Cabin Embarked test.a test.b
    0 -> 1           False    3      Braund, Mr. Owen Harris                             male   22  1     0     A/5 21171 7.25          S        1      2
    1 -> 2           True     1      Cumings, Mrs. John Bradley (Florence Briggs Thayer) female 38  1     0     PC 17599  71.2833 C85   C        2      1

Note the last two columns, test.a and test.b. Effectively, this approach creates a new frame (a and b) and then joins that frame to the existing frame.

This is fine for my use case, but it may be confusing for others to read. It also forces a prefix, e.g. "test", on the final columns, which isn't highly desirable.

Is there a way to append new values to the end of the row series, represented in the code above as x?

I find your approach quite elegant and clever. Because the new series shares its index with the original frame, it is also going to be pretty fast. So I think your solution may actually be better than the alternative option (but I have not measured this).

Anyway, the other option is to return new rows from your Frame.mapRowValues call: for each row, return the original row together with the additional columns.

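    titanic
    |> Frame.mapRowValues (fun x ->
        // Extra values for this row, boxed so they can be merged with the obj-valued row
        let add =
            if x.GetAs<int>("Pclass") > 1 then series ["a", box 1; "b", box 2]
            else series ["a", box 2; "b", box 1]
        // Append the new values to the end of the original row
        Series.merge x add)
    |> Frame.ofRows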
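If the column prefix is the main concern, a third variation is to build a frame that holds only the new columns and join it to the original on the shared row keys. The following is a minimal sketch rather than a measured or recommended solution. It assumes a recent F# Interactive that supports `#r "nuget: ..."`, and it uses a hypothetical two-row frame named `passengers` as a stand-in for the Titanic data so that it runs without the CSV file.

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical stand-in for the Titanic frame (only two integer columns)
let passengers =
    frame [ "PassengerId" => series [ 0 => 1; 1 => 2 ]
            "Pclass"      => series [ 0 => 3; 1 => 1 ] ]

// Frame containing only the derived columns, keyed by the same row index
let extra =
    passengers
    |> Frame.mapRowValues (fun row ->
        if row.GetAs<int>("Pclass") > 1 then series [ "a" => 1; "b" => 2 ]
        else series [ "a" => 2; "b" => 1 ])
    |> Frame.ofRows

// Join on the row keys; the new columns are named "a" and "b", with no prefix
let result = passengers.Join(extra)
```

With the real data, `passengers` would be replaced by `titanic`. Note that the join is expected to fail if the original frame already has a column named "a" or "b", so the new column names need to be unique.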
