== Physical Plan ==
CollectLimit (7)
+- * ColumnarToRow (6)
   +- InMemoryTableScan (1)
         +- InMemoryRelation (2)
               +- * Project (5)
                  +- * Filter (4)
                     +- Scan csv (3)


(1) InMemoryTableScan
Output [2]: [turnover#94215853, days_hold#94215887]
Arguments: [turnover#94215853, days_hold#94215887]

(2) InMemoryRelation
Arguments: [turnover#94215853, days_hold#94215887], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((turnover#94215567 = NA) OR (turnover#94215567 = null)) THEN null ELSE cast(turnover#94215567 as float) END AS turnover#94215853, CASE WHEN ((turnover#94215567 = NA) OR (turnover#94215567 = null)) THEN null ELSE (1.0 / cast(cast(turnover#94215567 as float) as double)) END AS days_hold#94215887]
+- *(1) Filter ((isnotnull(cap#94215535) AND NOT coalesce(((cap#94215535 = NA) OR (cap#94215535 = null)), false)) AND (cast(cap#94215535 as float) = 0.0))
   +- FileScan csv [cap#94215535,turnover#94215567] Batched: false, DataFilters: [isnotnull(cap#94215535), NOT coalesce(((cap#94215535 = NA) OR (cap#94215535 = null)), false), (c..., Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/tm1/eatm1_score/stats_..., PartitionFilters: [], PushedFilters: [IsNotNull(cap)], ReadSchema: struct<cap:string,turnover:string>
,None)

(3) Scan csv
Output [2]: [cap#94215535, turnover#94215567]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/tm1/eatm1_score/stats_overall.csv]
PushedFilters: [IsNotNull(cap)]
ReadSchema: struct<cap:string,turnover:string>

(4) Filter [codegen id : 1]
Input [2]: [cap#94215535, turnover#94215567]
Condition : ((isnotnull(cap#94215535) AND NOT coalesce(((cap#94215535 = NA) OR (cap#94215535 = null)), false)) AND (cast(cap#94215535 as float) = 0.0))

(5) Project [codegen id : 1]
Output [2]: [CASE WHEN ((turnover#94215567 = NA) OR (turnover#94215567 = null)) THEN null ELSE cast(turnover#94215567 as float) END AS turnover#94215853, CASE WHEN ((turnover#94215567 = NA) OR (turnover#94215567 = null)) THEN null ELSE (1.0 / cast(cast(turnover#94215567 as float) as double)) END AS days_hold#94215887]
Input [2]: [cap#94215535, turnover#94215567]

(6) ColumnarToRow [codegen id : 1]
Input [2]: [turnover#94215853, days_hold#94215887]

(7) CollectLimit
Input [2]: [turnover#94215853, days_hold#94215887]
Arguments: 1000000