== Physical Plan ==
CollectLimit (16)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (15)
               +- Exchange (14)
                  +- Union (13)
                     :- InMemoryTableScan (3)
                     :     +- InMemoryRelation (4)
                     :           +- * Project (6)
                     :              +- Scan csv (5)
                     +- * Project (12)
                        +- * Filter (11)
                           +- InMemoryTableScan (7)
                                 +- InMemoryRelation (8)
                                       +- * Project (10)
                                          +- Scan csv (9)
(1) InMemoryTableScan
Output [2]: [date#94159693, numcos#94159753]
Arguments: [date#94159693, numcos#94159753]
(2) InMemoryRelation
Arguments: [date#94159693, numcos#94159753], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(2) Sort [date#94159693 ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(date#94159693 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7517816]
+- Union
:- InMemoryTableScan [date#94159693, numcos#94159753]
: +- InMemoryRelation [date#94159693, overall#94159748, ret#94159749, resret#94159750, retnet#94159751, turnover#94159752, numcos#94159753, benchmark#94159754, excess_ret#94159755, excess_resret#94159756, excess_retnet#94159757], StorageLevel(disk, memory, deserialized, 1 replicas)
: +- *(1) Project [CASE WHEN (date#94159437 = null) THEN null ELSE cast(date#94159437 as date) END AS date#94159693, CASE WHEN ((overall#94159438 = NA) OR (overall#94159438 = null)) THEN null ELSE cast(overall#94159438 as int) END AS overall#94159748, CASE WHEN ((ret#94159439 = NA) OR (ret#94159439 = null)) THEN null ELSE cast(ret#94159439 as float) END AS ret#94159749, CASE WHEN ((resret#94159440 = NA) OR (resret#94159440 = null)) THEN null ELSE cast(resret#94159440 as float) END AS resret#94159750, CASE WHEN ((retnet#94159441 = NA) OR (retnet#94159441 = null)) THEN null ELSE cast(retnet#94159441 as float) END AS retnet#94159751, CASE WHEN ((turnover#94159442 = NA) OR (turnover#94159442 = null)) THEN null ELSE cast(turnover#94159442 as float) END AS turnover#94159752, CASE WHEN ((numcos#94159443 = NA) OR (numcos#94159443 = null)) THEN null ELSE cast(numcos#94159443 as float) END AS numcos#94159753, CASE WHEN ((benchmark#94159444 = NA) OR (benchmark#94159444 = null)) THEN null ELSE cast(benchmark#94159444 as float) END AS benchmark#94159754, CASE WHEN ((excess_ret#94159445 = NA) OR (excess_ret#94159445 = null)) THEN null ELSE cast(excess_ret#94159445 as float) END AS excess_ret#94159755, CASE WHEN ((excess_resret#94159446 = NA) OR (excess_resret#94159446 = null)) THEN null ELSE cast(excess_resret#94159446 as float) END AS excess_resret#94159756, CASE WHEN ((excess_retnet#94159447 = NA) OR (excess_retnet#94159447 = null)) THEN null ELSE cast(excess_retnet#94159447 as float) END AS excess_retnet#94159757]
: +- FileScan csv [date#94159437,overall#94159438,ret#94159439,resret#94159440,retnet#94159441,turnover#94159442,numcos#94159443,benchmark#94159444,excess_ret#94159445,excess_resret#94159446,excess_retnet#94159447] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/lon..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<date:string,overall:string,ret:string,resret:string,retnet:string,turnover:string,numcos:s...
+- *(1) Project [date#94159694, numcos#94159852]
+- *(1) Filter (isnotnull(cap#94159760) AND (cast(cap#94159760 as string) = 0))
+- InMemoryTableScan [cap#94159760, date#94159694, numcos#94159852], [isnotnull(cap#94159760), (cast(cap#94159760 as string) = 0)]
+- InMemoryRelation [date#94159694, cap#94159760, ret#94159763, resret#94159789, retnet#94159847, turnover#94159850, numcos#94159852, coverage#94159854, benchmark#94159859, excess_ret#94159862, excess_resret#94159865, excess_retnet#94159867], StorageLevel(disk, memory, deserialized, 1 replicas)
+- *(1) Project [CASE WHEN (date#94159464 = null) THEN null ELSE cast(date#94159464 as date) END AS date#94159694, CASE WHEN ((cap#94159465 = NA) OR (cap#94159465 = null)) THEN null ELSE cast(cap#94159465 as float) END AS cap#94159760, CASE WHEN ((ret#94159466 = NA) OR (ret#94159466 = null)) THEN null ELSE cast(ret#94159466 as float) END AS ret#94159763, CASE WHEN ((resret#94159467 = NA) OR (resret#94159467 = null)) THEN null ELSE cast(resret#94159467 as float) END AS resret#94159789, CASE WHEN ((retnet#94159468 = NA) OR (retnet#94159468 = null)) THEN null ELSE cast(retnet#94159468 as float) END AS retnet#94159847, CASE WHEN ((turnover#94159469 = NA) OR (turnover#94159469 = null)) THEN null ELSE cast(turnover#94159469 as float) END AS turnover#94159850, CASE WHEN ((numcos#94159470 = NA) OR (numcos#94159470 = null)) THEN null ELSE cast(numcos#94159470 as float) END AS numcos#94159852, CASE WHEN ((coverage#94159471 = NA) OR (coverage#94159471 = null)) THEN null ELSE cast(coverage#94159471 as float) END AS coverage#94159854, CASE WHEN ((benchmark#94159472 = NA) OR (benchmark#94159472 = null)) THEN null ELSE cast(benchmark#94159472 as float) END AS benchmark#94159859, CASE WHEN ((excess_ret#94159473 = NA) OR (excess_ret#94159473 = null)) THEN null ELSE cast(excess_ret#94159473 as float) END AS excess_ret#94159862, CASE WHEN ((excess_resret#94159474 = NA) OR (excess_resret#94159474 = null)) THEN null ELSE cast(excess_resret#94159474 as float) END AS excess_resret#94159865, CASE WHEN ((excess_retnet#94159475 = NA) OR (excess_retnet#94159475 = null)) THEN null ELSE cast(excess_retnet#94159475 as float) END AS excess_retnet#94159867]
+- FileScan csv [date#94159464,cap#94159465,ret#94159466,resret#94159467,retnet#94159468,turnover#94159469,numcos#94159470,coverage#94159471,benchmark#94159472,excess_ret#94159473,excess_resret#94159474,excess_retnet#94159475] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/lon..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<date:string,cap:string,ret:string,resret:string,retnet:string,turnover:string,numcos:strin...
,None), [date#94159693 ASC NULLS FIRST]
(3) InMemoryTableScan
Output [2]: [date#94159693, numcos#94159753]
Arguments: [date#94159693, numcos#94159753]
(4) InMemoryRelation
Arguments: [date#94159693, overall#94159748, ret#94159749, resret#94159750, retnet#94159751, turnover#94159752, numcos#94159753, benchmark#94159754, excess_ret#94159755, excess_resret#94159756, excess_retnet#94159757], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN (date#94159437 = null) THEN null ELSE cast(date#94159437 as date) END AS date#94159693, CASE WHEN ((overall#94159438 = NA) OR (overall#94159438 = null)) THEN null ELSE cast(overall#94159438 as int) END AS overall#94159748, CASE WHEN ((ret#94159439 = NA) OR (ret#94159439 = null)) THEN null ELSE cast(ret#94159439 as float) END AS ret#94159749, CASE WHEN ((resret#94159440 = NA) OR (resret#94159440 = null)) THEN null ELSE cast(resret#94159440 as float) END AS resret#94159750, CASE WHEN ((retnet#94159441 = NA) OR (retnet#94159441 = null)) THEN null ELSE cast(retnet#94159441 as float) END AS retnet#94159751, CASE WHEN ((turnover#94159442 = NA) OR (turnover#94159442 = null)) THEN null ELSE cast(turnover#94159442 as float) END AS turnover#94159752, CASE WHEN ((numcos#94159443 = NA) OR (numcos#94159443 = null)) THEN null ELSE cast(numcos#94159443 as float) END AS numcos#94159753, CASE WHEN ((benchmark#94159444 = NA) OR (benchmark#94159444 = null)) THEN null ELSE cast(benchmark#94159444 as float) END AS benchmark#94159754, CASE WHEN ((excess_ret#94159445 = NA) OR (excess_ret#94159445 = null)) THEN null ELSE cast(excess_ret#94159445 as float) END AS excess_ret#94159755, CASE WHEN ((excess_resret#94159446 = NA) OR (excess_resret#94159446 = null)) THEN null ELSE cast(excess_resret#94159446 as float) END AS excess_resret#94159756, CASE WHEN ((excess_retnet#94159447 = NA) OR (excess_retnet#94159447 = null)) THEN null ELSE cast(excess_retnet#94159447 as float) END AS excess_retnet#94159757]
+- FileScan csv [date#94159437,overall#94159438,ret#94159439,resret#94159440,retnet#94159441,turnover#94159442,numcos#94159443,benchmark#94159444,excess_ret#94159445,excess_resret#94159446,excess_retnet#94159447] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/lon..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<date:string,overall:string,ret:string,resret:string,retnet:string,turnover:string,numcos:s...
,None)
(5) Scan csv
Output [11]: [date#94159437, overall#94159438, ret#94159439, resret#94159440, retnet#94159441, turnover#94159442, numcos#94159443, benchmark#94159444, excess_ret#94159445, excess_resret#94159446, excess_retnet#94159447]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/longshort_overall.csv]
ReadSchema: struct<date:string,overall:string,ret:string,resret:string,retnet:string,turnover:string,numcos:string,benchmark:string,excess_ret:string,excess_resret:string,excess_retnet:string>
(6) Project [codegen id : 1]
Output [11]: [CASE WHEN (date#94159437 = null) THEN null ELSE cast(date#94159437 as date) END AS date#94159693, CASE WHEN ((overall#94159438 = NA) OR (overall#94159438 = null)) THEN null ELSE cast(overall#94159438 as int) END AS overall#94159748, CASE WHEN ((ret#94159439 = NA) OR (ret#94159439 = null)) THEN null ELSE cast(ret#94159439 as float) END AS ret#94159749, CASE WHEN ((resret#94159440 = NA) OR (resret#94159440 = null)) THEN null ELSE cast(resret#94159440 as float) END AS resret#94159750, CASE WHEN ((retnet#94159441 = NA) OR (retnet#94159441 = null)) THEN null ELSE cast(retnet#94159441 as float) END AS retnet#94159751, CASE WHEN ((turnover#94159442 = NA) OR (turnover#94159442 = null)) THEN null ELSE cast(turnover#94159442 as float) END AS turnover#94159752, CASE WHEN ((numcos#94159443 = NA) OR (numcos#94159443 = null)) THEN null ELSE cast(numcos#94159443 as float) END AS numcos#94159753, CASE WHEN ((benchmark#94159444 = NA) OR (benchmark#94159444 = null)) THEN null ELSE cast(benchmark#94159444 as float) END AS benchmark#94159754, CASE WHEN ((excess_ret#94159445 = NA) OR (excess_ret#94159445 = null)) THEN null ELSE cast(excess_ret#94159445 as float) END AS excess_ret#94159755, CASE WHEN ((excess_resret#94159446 = NA) OR (excess_resret#94159446 = null)) THEN null ELSE cast(excess_resret#94159446 as float) END AS excess_resret#94159756, CASE WHEN ((excess_retnet#94159447 = NA) OR (excess_retnet#94159447 = null)) THEN null ELSE cast(excess_retnet#94159447 as float) END AS excess_retnet#94159757]
Input [11]: [date#94159437, overall#94159438, ret#94159439, resret#94159440, retnet#94159441, turnover#94159442, numcos#94159443, benchmark#94159444, excess_ret#94159445, excess_resret#94159446, excess_retnet#94159447]
(7) InMemoryTableScan
Output [3]: [cap#94159760, date#94159694, numcos#94159852]
Arguments: [cap#94159760, date#94159694, numcos#94159852], [isnotnull(cap#94159760), (cast(cap#94159760 as string) = 0)]
(8) InMemoryRelation
Arguments: [date#94159694, cap#94159760, ret#94159763, resret#94159789, retnet#94159847, turnover#94159850, numcos#94159852, coverage#94159854, benchmark#94159859, excess_ret#94159862, excess_resret#94159865, excess_retnet#94159867], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN (date#94159464 = null) THEN null ELSE cast(date#94159464 as date) END AS date#94159694, CASE WHEN ((cap#94159465 = NA) OR (cap#94159465 = null)) THEN null ELSE cast(cap#94159465 as float) END AS cap#94159760, CASE WHEN ((ret#94159466 = NA) OR (ret#94159466 = null)) THEN null ELSE cast(ret#94159466 as float) END AS ret#94159763, CASE WHEN ((resret#94159467 = NA) OR (resret#94159467 = null)) THEN null ELSE cast(resret#94159467 as float) END AS resret#94159789, CASE WHEN ((retnet#94159468 = NA) OR (retnet#94159468 = null)) THEN null ELSE cast(retnet#94159468 as float) END AS retnet#94159847, CASE WHEN ((turnover#94159469 = NA) OR (turnover#94159469 = null)) THEN null ELSE cast(turnover#94159469 as float) END AS turnover#94159850, CASE WHEN ((numcos#94159470 = NA) OR (numcos#94159470 = null)) THEN null ELSE cast(numcos#94159470 as float) END AS numcos#94159852, CASE WHEN ((coverage#94159471 = NA) OR (coverage#94159471 = null)) THEN null ELSE cast(coverage#94159471 as float) END AS coverage#94159854, CASE WHEN ((benchmark#94159472 = NA) OR (benchmark#94159472 = null)) THEN null ELSE cast(benchmark#94159472 as float) END AS benchmark#94159859, CASE WHEN ((excess_ret#94159473 = NA) OR (excess_ret#94159473 = null)) THEN null ELSE cast(excess_ret#94159473 as float) END AS excess_ret#94159862, CASE WHEN ((excess_resret#94159474 = NA) OR (excess_resret#94159474 = null)) THEN null ELSE cast(excess_resret#94159474 as float) END AS excess_resret#94159865, CASE WHEN ((excess_retnet#94159475 = NA) OR (excess_retnet#94159475 = null)) THEN null ELSE cast(excess_retnet#94159475 as float) END AS excess_retnet#94159867]
+- FileScan csv [date#94159464,cap#94159465,ret#94159466,resret#94159467,retnet#94159468,turnover#94159469,numcos#94159470,coverage#94159471,benchmark#94159472,excess_ret#94159473,excess_resret#94159474,excess_retnet#94159475] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/lon..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<date:string,cap:string,ret:string,resret:string,retnet:string,turnover:string,numcos:strin...
,None)
(9) Scan csv
Output [12]: [date#94159464, cap#94159465, ret#94159466, resret#94159467, retnet#94159468, turnover#94159469, numcos#94159470, coverage#94159471, benchmark#94159472, excess_ret#94159473, excess_resret#94159474, excess_retnet#94159475]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/risk_factors/value/longshort_cap.csv]
ReadSchema: struct<date:string,cap:string,ret:string,resret:string,retnet:string,turnover:string,numcos:string,coverage:string,benchmark:string,excess_ret:string,excess_resret:string,excess_retnet:string>
(10) Project [codegen id : 1]
Output [12]: [CASE WHEN (date#94159464 = null) THEN null ELSE cast(date#94159464 as date) END AS date#94159694, CASE WHEN ((cap#94159465 = NA) OR (cap#94159465 = null)) THEN null ELSE cast(cap#94159465 as float) END AS cap#94159760, CASE WHEN ((ret#94159466 = NA) OR (ret#94159466 = null)) THEN null ELSE cast(ret#94159466 as float) END AS ret#94159763, CASE WHEN ((resret#94159467 = NA) OR (resret#94159467 = null)) THEN null ELSE cast(resret#94159467 as float) END AS resret#94159789, CASE WHEN ((retnet#94159468 = NA) OR (retnet#94159468 = null)) THEN null ELSE cast(retnet#94159468 as float) END AS retnet#94159847, CASE WHEN ((turnover#94159469 = NA) OR (turnover#94159469 = null)) THEN null ELSE cast(turnover#94159469 as float) END AS turnover#94159850, CASE WHEN ((numcos#94159470 = NA) OR (numcos#94159470 = null)) THEN null ELSE cast(numcos#94159470 as float) END AS numcos#94159852, CASE WHEN ((coverage#94159471 = NA) OR (coverage#94159471 = null)) THEN null ELSE cast(coverage#94159471 as float) END AS coverage#94159854, CASE WHEN ((benchmark#94159472 = NA) OR (benchmark#94159472 = null)) THEN null ELSE cast(benchmark#94159472 as float) END AS benchmark#94159859, CASE WHEN ((excess_ret#94159473 = NA) OR (excess_ret#94159473 = null)) THEN null ELSE cast(excess_ret#94159473 as float) END AS excess_ret#94159862, CASE WHEN ((excess_resret#94159474 = NA) OR (excess_resret#94159474 = null)) THEN null ELSE cast(excess_resret#94159474 as float) END AS excess_resret#94159865, CASE WHEN ((excess_retnet#94159475 = NA) OR (excess_retnet#94159475 = null)) THEN null ELSE cast(excess_retnet#94159475 as float) END AS excess_retnet#94159867]
Input [12]: [date#94159464, cap#94159465, ret#94159466, resret#94159467, retnet#94159468, turnover#94159469, numcos#94159470, coverage#94159471, benchmark#94159472, excess_ret#94159473, excess_resret#94159474, excess_retnet#94159475]
(11) Filter [codegen id : 1]
Input [3]: [cap#94159760, date#94159694, numcos#94159852]
Condition : (isnotnull(cap#94159760) AND (cast(cap#94159760 as string) = 0))
(12) Project [codegen id : 1]
Output [2]: [date#94159694, numcos#94159852]
Input [3]: [cap#94159760, date#94159694, numcos#94159852]
(13) Union
(14) Exchange
Input [2]: [date#94159693, numcos#94159753]
Arguments: rangepartitioning(date#94159693 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7517816]
(15) Sort [codegen id : 2]
Input [2]: [date#94159693, numcos#94159753]
Arguments: [date#94159693 ASC NULLS FIRST], true, 0
(16) CollectLimit
Input [2]: [date#94159693, numcos#94159753]
Arguments: 1000000