== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (19)
               +- Exchange (18)
                  +- * Project (17)
                     +- * BroadcastHashJoin Inner BuildLeft (16)
                        :- BroadcastExchange (10)
                        :  +- * Project (9)
                        :     +- * Filter (8)
                        :        +- * ColumnarToRow (7)
                        :           +- InMemoryTableScan (3)
                        :                 +- InMemoryRelation (4)
                        :                       +- * Project (6)
                        :                          +- Scan csv (5)
                        +- * Filter (15)
                           +- InMemoryTableScan (11)
                                 +- InMemoryRelation (12)
                                       +- * Project (14)
                                          +- Scan csv (13)


(1) InMemoryTableScan
Output [7]: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]
Arguments: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]

(2) InMemoryRelation
Arguments: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0 +- Exchange rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7504898] +- *(2) Project [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359] +- *(2) BroadcastHashJoin [sector_id#93996102], [sector_id#93880529], Inner, BuildLeft, false :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7504891] : +- *(1) Project [sector_id#93996102, numcos#93996112, numdates#93996115, coverage#93996359, round((cast(numcos#93996112 as double) / cast(coverage#93996359 as double)), 0) AS universe#93996447] : +- *(1) Filter isnotnull(sector_id#93996102) : +- *(1) ColumnarToRow : +- InMemoryTableScan [coverage#93996359, numcos#93996112,
numdates#93996115, sector_id#93996102], [isnotnull(sector_id#93996102)] : +- InMemoryRelation [sector_id#93996102, retIC#93996106, resretIC#93996109, numcos#93996112, numdates#93996115, annual_bmret#93996127, annual_ret#93996128, std_ret#93996129, Sharpe_ret#93996130, PctPos_ret#93996131, TR_ret#93996132, IR_ret#93996133, annual_resret#93996134, std_resret#93996135, Sharpe_resret#93996136, PctPos_resret#93996139, TR_resret#93996141, IR_resret#93996143, annual_retnet#93996200, std_retnet#93996202, Sharpe_retnet#93996354, PctPos_retnet#93996355, TR_retnet#93996356, IR_retnet#93996357, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#93995818 = NA) OR (sector_id#93995818 = null)) THEN null ELSE cast(sector_id#93995818 as int) END AS sector_id#93996102, CASE WHEN ((retIC#93995819 = NA) OR (retIC#93995819 = null)) THEN null ELSE cast(retIC#93995819 as float) END AS retIC#93996106, CASE WHEN ((resretIC#93995820 = NA) OR (resretIC#93995820 = null)) THEN null ELSE cast(resretIC#93995820 as float) END AS resretIC#93996109, CASE WHEN ((numcos#93995821 = NA) OR (numcos#93995821 = null)) THEN null ELSE cast(numcos#93995821 as float) END AS numcos#93996112, CASE WHEN ((numdates#93995822 = NA) OR (numdates#93995822 = null)) THEN null ELSE cast(numdates#93995822 as float) END AS numdates#93996115, CASE WHEN ((annual_bmret#93995823 = NA) OR (annual_bmret#93995823 = null)) THEN null ELSE cast(annual_bmret#93995823 as float) END AS annual_bmret#93996127, CASE WHEN ((annual_ret#93995824 = NA) OR (annual_ret#93995824 = null)) THEN null ELSE cast(annual_ret#93995824 as float) END AS annual_ret#93996128, CASE WHEN ((std_ret#93995825 = NA) OR (std_ret#93995825 = null)) THEN null ELSE cast(std_ret#93995825 as float) END AS std_ret#93996129, CASE WHEN ((Sharpe_ret#93995826 = NA) OR (Sharpe_ret#93995826 = null)) THEN null ELSE cast(Sharpe_ret#93995826 as float) END AS Sharpe_ret#93996130, CASE WHEN ((PctPos_ret#93995827 = NA) 
OR (PctPos_ret#93995827 = null)) THEN null ELSE cast(PctPos_ret#93995827 as float) END AS PctPos_ret#93996131, CASE WHEN ((TR_ret#93995828 = NA) OR (TR_ret#93995828 = null)) THEN null ELSE cast(TR_ret#93995828 as float) END AS TR_ret#93996132, CASE WHEN ((IR_ret#93995829 = NA) OR (IR_ret#93995829 = null)) THEN null ELSE cast(IR_ret#93995829 as float) END AS IR_ret#93996133, CASE WHEN ((annual_resret#93995830 = NA) OR (annual_resret#93995830 = null)) THEN null ELSE cast(annual_resret#93995830 as float) END AS annual_resret#93996134, CASE WHEN ((std_resret#93995831 = NA) OR (std_resret#93995831 = null)) THEN null ELSE cast(std_resret#93995831 as float) END AS std_resret#93996135, CASE WHEN ((Sharpe_resret#93995832 = NA) OR (Sharpe_resret#93995832 = null)) THEN null ELSE cast(Sharpe_resret#93995832 as float) END AS Sharpe_resret#93996136, CASE WHEN ((PctPos_resret#93995833 = NA) OR (PctPos_resret#93995833 = null)) THEN null ELSE cast(PctPos_resret#93995833 as float) END AS PctPos_resret#93996139, CASE WHEN ((TR_resret#93995834 = NA) OR (TR_resret#93995834 = null)) THEN null ELSE cast(TR_resret#93995834 as float) END AS TR_resret#93996141, CASE WHEN ((IR_resret#93995835 = NA) OR (IR_resret#93995835 = null)) THEN null ELSE cast(IR_resret#93995835 as float) END AS IR_resret#93996143, CASE WHEN ((annual_retnet#93995836 = NA) OR (annual_retnet#93995836 = null)) THEN null ELSE cast(annual_retnet#93995836 as float) END AS annual_retnet#93996200, CASE WHEN ((std_retnet#93995837 = NA) OR (std_retnet#93995837 = null)) THEN null ELSE cast(std_retnet#93995837 as float) END AS std_retnet#93996202, CASE WHEN ((Sharpe_retnet#93995838 = NA) OR (Sharpe_retnet#93995838 = null)) THEN null ELSE cast(Sharpe_retnet#93995838 as float) END AS Sharpe_retnet#93996354, CASE WHEN ((PctPos_retnet#93995839 = NA) OR (PctPos_retnet#93995839 = null)) THEN null ELSE cast(PctPos_retnet#93995839 as float) END AS PctPos_retnet#93996355, CASE WHEN ((TR_retnet#93995840 = NA) OR (TR_retnet#93995840 = null)) 
THEN null ELSE cast(TR_retnet#93995840 as float) END AS TR_retnet#93996356, CASE WHEN ((IR_retnet#93995841 = NA) OR (IR_retnet#93995841 = null)) THEN null ELSE cast(IR_retnet#93995841 as float) END AS IR_retnet#93996357, ... 2 more fields] : +- FileScan csv [sector_id#93995818,retIC#93995819,resretIC#93995820,numcos#93995821,numdates#93995822,annual_bmret#93995823,annual_ret#93995824,std_ret#93995825,Sharpe_ret#93995826,PctPos_ret#93995827,TR_ret#93995828,IR_ret#93995829,annual_resret#93995830,std_resret#93995831,Sharpe_resret#93995832,PctPos_resret#93995833,TR_resret#93995834,IR_resret#93995835,annual_retnet#93995836,std_retnet#93995837,Sharpe_retnet#93995838,PctPos_retnet#93995839,TR_retnet#93995840,IR_retnet#93995841,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/estimize_signal_histor..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... 
+- *(2) Filter isnotnull(sector_id#93880529) +- InMemoryTableScan [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)] +- InMemoryRelation [sector_id#93880529, sort#93880530, description#93880532, universe#93880534], StorageLevel(disk, memory, deserialized, 1 replicas) +- *(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None), [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#93996359, numcos#93996112, numdates#93996115, sector_id#93996102]
Arguments: [coverage#93996359, numcos#93996112, numdates#93996115, sector_id#93996102], [isnotnull(sector_id#93996102)]

(4) InMemoryRelation
Arguments: [sector_id#93996102, retIC#93996106, resretIC#93996109, numcos#93996112, numdates#93996115, annual_bmret#93996127, annual_ret#93996128, std_ret#93996129, Sharpe_ret#93996130, PctPos_ret#93996131, TR_ret#93996132, IR_ret#93996133, annual_resret#93996134, std_resret#93996135, Sharpe_resret#93996136, PctPos_resret#93996139, TR_resret#93996141, IR_resret#93996143, annual_retnet#93996200, std_retnet#93996202, Sharpe_retnet#93996354, PctPos_retnet#93996355, TR_retnet#93996356, IR_retnet#93996357,
... 2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#93995818 = NA) OR (sector_id#93995818 = null)) THEN null ELSE cast(sector_id#93995818 as int) END AS sector_id#93996102, CASE WHEN ((retIC#93995819 = NA) OR (retIC#93995819 = null)) THEN null ELSE cast(retIC#93995819 as float) END AS retIC#93996106, CASE WHEN ((resretIC#93995820 = NA) OR (resretIC#93995820 = null)) THEN null ELSE cast(resretIC#93995820 as float) END AS resretIC#93996109, CASE WHEN ((numcos#93995821 = NA) OR (numcos#93995821 = null)) THEN null ELSE cast(numcos#93995821 as float) END AS numcos#93996112, CASE WHEN ((numdates#93995822 = NA) OR (numdates#93995822 = null)) THEN null ELSE cast(numdates#93995822 as float) END AS numdates#93996115, CASE WHEN ((annual_bmret#93995823 = NA) OR (annual_bmret#93995823 = null)) THEN null ELSE cast(annual_bmret#93995823 as float) END AS annual_bmret#93996127, CASE WHEN ((annual_ret#93995824 = NA) OR (annual_ret#93995824 = null)) THEN null ELSE cast(annual_ret#93995824 as float) END AS annual_ret#93996128, CASE WHEN ((std_ret#93995825 = NA) OR (std_ret#93995825 = null)) THEN null ELSE cast(std_ret#93995825 as float) END AS std_ret#93996129, CASE WHEN ((Sharpe_ret#93995826 = NA) OR (Sharpe_ret#93995826 = null)) THEN null ELSE cast(Sharpe_ret#93995826 as float) END AS Sharpe_ret#93996130, CASE WHEN ((PctPos_ret#93995827 = NA) OR (PctPos_ret#93995827 = null)) THEN null ELSE cast(PctPos_ret#93995827 as float) END AS PctPos_ret#93996131, CASE WHEN ((TR_ret#93995828 = NA) OR (TR_ret#93995828 = null)) THEN null ELSE cast(TR_ret#93995828 as float) END AS TR_ret#93996132, CASE WHEN ((IR_ret#93995829 = NA) OR (IR_ret#93995829 = null)) THEN null ELSE cast(IR_ret#93995829 as float) END AS IR_ret#93996133, CASE WHEN ((annual_resret#93995830 = NA) OR (annual_resret#93995830 = null)) THEN null ELSE cast(annual_resret#93995830 as 
float) END AS annual_resret#93996134, CASE WHEN ((std_resret#93995831 = NA) OR (std_resret#93995831 = null)) THEN null ELSE cast(std_resret#93995831 as float) END AS std_resret#93996135, CASE WHEN ((Sharpe_resret#93995832 = NA) OR (Sharpe_resret#93995832 = null)) THEN null ELSE cast(Sharpe_resret#93995832 as float) END AS Sharpe_resret#93996136, CASE WHEN ((PctPos_resret#93995833 = NA) OR (PctPos_resret#93995833 = null)) THEN null ELSE cast(PctPos_resret#93995833 as float) END AS PctPos_resret#93996139, CASE WHEN ((TR_resret#93995834 = NA) OR (TR_resret#93995834 = null)) THEN null ELSE cast(TR_resret#93995834 as float) END AS TR_resret#93996141, CASE WHEN ((IR_resret#93995835 = NA) OR (IR_resret#93995835 = null)) THEN null ELSE cast(IR_resret#93995835 as float) END AS IR_resret#93996143, CASE WHEN ((annual_retnet#93995836 = NA) OR (annual_retnet#93995836 = null)) THEN null ELSE cast(annual_retnet#93995836 as float) END AS annual_retnet#93996200, CASE WHEN ((std_retnet#93995837 = NA) OR (std_retnet#93995837 = null)) THEN null ELSE cast(std_retnet#93995837 as float) END AS std_retnet#93996202, CASE WHEN ((Sharpe_retnet#93995838 = NA) OR (Sharpe_retnet#93995838 = null)) THEN null ELSE cast(Sharpe_retnet#93995838 as float) END AS Sharpe_retnet#93996354, CASE WHEN ((PctPos_retnet#93995839 = NA) OR (PctPos_retnet#93995839 = null)) THEN null ELSE cast(PctPos_retnet#93995839 as float) END AS PctPos_retnet#93996355, CASE WHEN ((TR_retnet#93995840 = NA) OR (TR_retnet#93995840 = null)) THEN null ELSE cast(TR_retnet#93995840 as float) END AS TR_retnet#93996356, CASE WHEN ((IR_retnet#93995841 = NA) OR (IR_retnet#93995841 = null)) THEN null ELSE cast(IR_retnet#93995841 as float) END AS IR_retnet#93996357, ... 
2 more fields] +- FileScan csv [sector_id#93995818,retIC#93995819,resretIC#93995820,numcos#93995821,numdates#93995822,annual_bmret#93995823,annual_ret#93995824,std_ret#93995825,Sharpe_ret#93995826,PctPos_ret#93995827,TR_ret#93995828,IR_ret#93995829,annual_resret#93995830,std_resret#93995831,Sharpe_resret#93995832,PctPos_resret#93995833,TR_resret#93995834,IR_resret#93995835,annual_retnet#93995836,std_retnet#93995837,Sharpe_retnet#93995838,PctPos_retnet#93995839,TR_retnet#93995840,IR_retnet#93995841,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/estimize_signal_histor..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... ,None)

(5) Scan csv
Output [26]: [sector_id#93995818, retIC#93995819, resretIC#93995820, numcos#93995821, numdates#93995822, annual_bmret#93995823, annual_ret#93995824, std_ret#93995825, Sharpe_ret#93995826, PctPos_ret#93995827, TR_ret#93995828, IR_ret#93995829, annual_resret#93995830, std_resret#93995831, Sharpe_resret#93995832, PctPos_resret#93995833, TR_resret#93995834, IR_resret#93995835, annual_retnet#93995836, std_retnet#93995837, Sharpe_retnet#93995838, PctPos_retnet#93995839, TR_retnet#93995840, IR_retnet#93995841, turnover#93995842, coverage#93995843]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/estimize_signal_history/estimizesignal_preearnings/stats_sector_id.csv]
ReadSchema:
struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#93995818 = NA) OR (sector_id#93995818 = null)) THEN null ELSE cast(sector_id#93995818 as int) END AS sector_id#93996102, CASE WHEN ((retIC#93995819 = NA) OR (retIC#93995819 = null)) THEN null ELSE cast(retIC#93995819 as float) END AS retIC#93996106, CASE WHEN ((resretIC#93995820 = NA) OR (resretIC#93995820 = null)) THEN null ELSE cast(resretIC#93995820 as float) END AS resretIC#93996109, CASE WHEN ((numcos#93995821 = NA) OR (numcos#93995821 = null)) THEN null ELSE cast(numcos#93995821 as float) END AS numcos#93996112, CASE WHEN ((numdates#93995822 = NA) OR (numdates#93995822 = null)) THEN null ELSE cast(numdates#93995822 as float) END AS numdates#93996115, CASE WHEN ((annual_bmret#93995823 = NA) OR (annual_bmret#93995823 = null)) THEN null ELSE cast(annual_bmret#93995823 as float) END AS annual_bmret#93996127, CASE WHEN ((annual_ret#93995824 = NA) OR (annual_ret#93995824 = null)) THEN null ELSE cast(annual_ret#93995824 as float) END AS annual_ret#93996128, CASE WHEN ((std_ret#93995825 = NA) OR (std_ret#93995825 = null)) THEN null ELSE cast(std_ret#93995825 as float) END AS std_ret#93996129, CASE WHEN ((Sharpe_ret#93995826 = NA) OR (Sharpe_ret#93995826 = null)) THEN null ELSE cast(Sharpe_ret#93995826 as float) END AS Sharpe_ret#93996130, CASE WHEN ((PctPos_ret#93995827 = NA) OR (PctPos_ret#93995827 = null)) THEN null ELSE cast(PctPos_ret#93995827 as float) END AS PctPos_ret#93996131, CASE WHEN ((TR_ret#93995828 = NA) OR (TR_ret#93995828 = null))
THEN null ELSE cast(TR_ret#93995828 as float) END AS TR_ret#93996132, CASE WHEN ((IR_ret#93995829 = NA) OR (IR_ret#93995829 = null)) THEN null ELSE cast(IR_ret#93995829 as float) END AS IR_ret#93996133, CASE WHEN ((annual_resret#93995830 = NA) OR (annual_resret#93995830 = null)) THEN null ELSE cast(annual_resret#93995830 as float) END AS annual_resret#93996134, CASE WHEN ((std_resret#93995831 = NA) OR (std_resret#93995831 = null)) THEN null ELSE cast(std_resret#93995831 as float) END AS std_resret#93996135, CASE WHEN ((Sharpe_resret#93995832 = NA) OR (Sharpe_resret#93995832 = null)) THEN null ELSE cast(Sharpe_resret#93995832 as float) END AS Sharpe_resret#93996136, CASE WHEN ((PctPos_resret#93995833 = NA) OR (PctPos_resret#93995833 = null)) THEN null ELSE cast(PctPos_resret#93995833 as float) END AS PctPos_resret#93996139, CASE WHEN ((TR_resret#93995834 = NA) OR (TR_resret#93995834 = null)) THEN null ELSE cast(TR_resret#93995834 as float) END AS TR_resret#93996141, CASE WHEN ((IR_resret#93995835 = NA) OR (IR_resret#93995835 = null)) THEN null ELSE cast(IR_resret#93995835 as float) END AS IR_resret#93996143, CASE WHEN ((annual_retnet#93995836 = NA) OR (annual_retnet#93995836 = null)) THEN null ELSE cast(annual_retnet#93995836 as float) END AS annual_retnet#93996200, CASE WHEN ((std_retnet#93995837 = NA) OR (std_retnet#93995837 = null)) THEN null ELSE cast(std_retnet#93995837 as float) END AS std_retnet#93996202, CASE WHEN ((Sharpe_retnet#93995838 = NA) OR (Sharpe_retnet#93995838 = null)) THEN null ELSE cast(Sharpe_retnet#93995838 as float) END AS Sharpe_retnet#93996354, CASE WHEN ((PctPos_retnet#93995839 = NA) OR (PctPos_retnet#93995839 = null)) THEN null ELSE cast(PctPos_retnet#93995839 as float) END AS PctPos_retnet#93996355, CASE WHEN ((TR_retnet#93995840 = NA) OR (TR_retnet#93995840 = null)) THEN null ELSE cast(TR_retnet#93995840 as float) END AS TR_retnet#93996356, CASE WHEN ((IR_retnet#93995841 = NA) OR (IR_retnet#93995841 = null)) THEN null ELSE 
cast(IR_retnet#93995841 as float) END AS IR_retnet#93996357, CASE WHEN ((turnover#93995842 = NA) OR (turnover#93995842 = null)) THEN null ELSE cast(turnover#93995842 as float) END AS turnover#93996358, CASE WHEN ((coverage#93995843 = NA) OR (coverage#93995843 = null)) THEN null ELSE cast(coverage#93995843 as float) END AS coverage#93996359]
Input [26]: [sector_id#93995818, retIC#93995819, resretIC#93995820, numcos#93995821, numdates#93995822, annual_bmret#93995823, annual_ret#93995824, std_ret#93995825, Sharpe_ret#93995826, PctPos_ret#93995827, TR_ret#93995828, IR_ret#93995829, annual_resret#93995830, std_resret#93995831, Sharpe_resret#93995832, PctPos_resret#93995833, TR_resret#93995834, IR_resret#93995835, annual_retnet#93995836, std_retnet#93995837, Sharpe_retnet#93995838, PctPos_retnet#93995839, TR_retnet#93995840, IR_retnet#93995841, turnover#93995842, coverage#93995843]

(7) ColumnarToRow [codegen id : 1]
Input [4]: [coverage#93996359, numcos#93996112, numdates#93996115, sector_id#93996102]

(8) Filter [codegen id : 1]
Input [4]: [coverage#93996359, numcos#93996112, numdates#93996115, sector_id#93996102]
Condition : isnotnull(sector_id#93996102)

(9) Project [codegen id : 1]
Output [5]: [sector_id#93996102, numcos#93996112, numdates#93996115, coverage#93996359, round((cast(numcos#93996112 as double) / cast(coverage#93996359 as double)), 0) AS universe#93996447]
Input [4]: [coverage#93996359, numcos#93996112, numdates#93996115, sector_id#93996102]

(10) BroadcastExchange
Input [5]: [sector_id#93996102, numcos#93996112, numdates#93996115, coverage#93996359, universe#93996447]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7504891]

(11) InMemoryTableScan
Output [3]: [sector_id#93880529, sort#93880530, description#93880532]
Arguments: [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)]

(12) InMemoryRelation
Arguments: [sector_id#93880529, sort#93880530, description#93880532,
universe#93880534], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None)

(13) Scan csv
Output [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(14) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534]
Input [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]

(15) Filter
Input [3]: [sector_id#93880529, sort#93880530,
description#93880532]
Condition : isnotnull(sector_id#93880529)

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#93996102]
Right keys [1]: [sector_id#93880529]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]
Input [8]: [sector_id#93996102, numcos#93996112, numdates#93996115, coverage#93996359, universe#93996447, sector_id#93880529, sort#93880530, description#93880532]

(18) Exchange
Input [7]: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]
Arguments: rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7504898]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]
Arguments: [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#93996102, numcos#93996112, numdates#93996115, sort#93880530, description#93880532, universe#93996447, coverage#93996359]
Arguments: 1000000