== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (19)
               +- Exchange (18)
                  +- * Project (17)
                     +- * BroadcastHashJoin Inner BuildLeft (16)
                        :- BroadcastExchange (10)
                        :  +- * Project (9)
                        :     +- * Filter (8)
                        :        +- * ColumnarToRow (7)
                        :           +- InMemoryTableScan (3)
                        :                 +- InMemoryRelation (4)
                        :                       +- * Project (6)
                        :                          +- Scan csv (5)
                        +- * Filter (15)
                           +- InMemoryTableScan (11)
                                 +- InMemoryRelation (12)
                                       +- * Project (14)
                                          +- Scan csv (13)


(1) InMemoryTableScan
Output [7]: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
Arguments: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]

(2) InMemoryRelation
Arguments: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7526503]
   +- *(2) Project [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
      +- *(2) BroadcastHashJoin [sector_id#94268085], [sector_id#94160418], Inner, BuildLeft, false
         :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7526496]
         :  +- *(1) Project [sector_id#94268085, numcos#94268096, numdates#94268099, coverage#94268349, round((cast(numcos#94268096 as double) / cast(coverage#94268349 as double)), 0) AS universe#94268444]
         :     +- *(1) Filter isnotnull(sector_id#94268085)
         :        +- *(1) ColumnarToRow
         :           +- InMemoryTableScan [coverage#94268349, numcos#94268096,
numdates#94268099, sector_id#94268085], [isnotnull(sector_id#94268085)] : +- InMemoryRelation [sector_id#94268085, retIC#94268090, resretIC#94268094, numcos#94268096, numdates#94268099, annual_bmret#94268103, annual_ret#94268107, std_ret#94268111, Sharpe_ret#94268115, PctPos_ret#94268120, TR_ret#94268124, IR_ret#94268126, annual_resret#94268128, std_resret#94268141, Sharpe_resret#94268270, PctPos_resret#94268271, TR_resret#94268284, IR_resret#94268285, annual_retnet#94268286, std_retnet#94268298, Sharpe_retnet#94268300, PctPos_retnet#94268301, TR_retnet#94268316, IR_retnet#94268330, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#94267879 = NA) OR (sector_id#94267879 = null)) THEN null ELSE cast(sector_id#94267879 as int) END AS sector_id#94268085, CASE WHEN ((retIC#94267880 = NA) OR (retIC#94267880 = null)) THEN null ELSE cast(retIC#94267880 as float) END AS retIC#94268090, CASE WHEN ((resretIC#94267881 = NA) OR (resretIC#94267881 = null)) THEN null ELSE cast(resretIC#94267881 as float) END AS resretIC#94268094, CASE WHEN ((numcos#94267882 = NA) OR (numcos#94267882 = null)) THEN null ELSE cast(numcos#94267882 as float) END AS numcos#94268096, CASE WHEN ((numdates#94267883 = NA) OR (numdates#94267883 = null)) THEN null ELSE cast(numdates#94267883 as float) END AS numdates#94268099, CASE WHEN ((annual_bmret#94267884 = NA) OR (annual_bmret#94267884 = null)) THEN null ELSE cast(annual_bmret#94267884 as float) END AS annual_bmret#94268103, CASE WHEN ((annual_ret#94267885 = NA) OR (annual_ret#94267885 = null)) THEN null ELSE cast(annual_ret#94267885 as float) END AS annual_ret#94268107, CASE WHEN ((std_ret#94267886 = NA) OR (std_ret#94267886 = null)) THEN null ELSE cast(std_ret#94267886 as float) END AS std_ret#94268111, CASE WHEN ((Sharpe_ret#94267887 = NA) OR (Sharpe_ret#94267887 = null)) THEN null ELSE cast(Sharpe_ret#94267887 as float) END AS Sharpe_ret#94268115, CASE WHEN ((PctPos_ret#94267888 = NA) 
OR (PctPos_ret#94267888 = null)) THEN null ELSE cast(PctPos_ret#94267888 as float) END AS PctPos_ret#94268120, CASE WHEN ((TR_ret#94267889 = NA) OR (TR_ret#94267889 = null)) THEN null ELSE cast(TR_ret#94267889 as float) END AS TR_ret#94268124, CASE WHEN ((IR_ret#94267890 = NA) OR (IR_ret#94267890 = null)) THEN null ELSE cast(IR_ret#94267890 as float) END AS IR_ret#94268126, CASE WHEN ((annual_resret#94267891 = NA) OR (annual_resret#94267891 = null)) THEN null ELSE cast(annual_resret#94267891 as float) END AS annual_resret#94268128, CASE WHEN ((std_resret#94267892 = NA) OR (std_resret#94267892 = null)) THEN null ELSE cast(std_resret#94267892 as float) END AS std_resret#94268141, CASE WHEN ((Sharpe_resret#94267893 = NA) OR (Sharpe_resret#94267893 = null)) THEN null ELSE cast(Sharpe_resret#94267893 as float) END AS Sharpe_resret#94268270, CASE WHEN ((PctPos_resret#94267894 = NA) OR (PctPos_resret#94267894 = null)) THEN null ELSE cast(PctPos_resret#94267894 as float) END AS PctPos_resret#94268271, CASE WHEN ((TR_resret#94267895 = NA) OR (TR_resret#94267895 = null)) THEN null ELSE cast(TR_resret#94267895 as float) END AS TR_resret#94268284, CASE WHEN ((IR_resret#94267896 = NA) OR (IR_resret#94267896 = null)) THEN null ELSE cast(IR_resret#94267896 as float) END AS IR_resret#94268285, CASE WHEN ((annual_retnet#94267897 = NA) OR (annual_retnet#94267897 = null)) THEN null ELSE cast(annual_retnet#94267897 as float) END AS annual_retnet#94268286, CASE WHEN ((std_retnet#94267898 = NA) OR (std_retnet#94267898 = null)) THEN null ELSE cast(std_retnet#94267898 as float) END AS std_retnet#94268298, CASE WHEN ((Sharpe_retnet#94267899 = NA) OR (Sharpe_retnet#94267899 = null)) THEN null ELSE cast(Sharpe_retnet#94267899 as float) END AS Sharpe_retnet#94268300, CASE WHEN ((PctPos_retnet#94267900 = NA) OR (PctPos_retnet#94267900 = null)) THEN null ELSE cast(PctPos_retnet#94267900 as float) END AS PctPos_retnet#94268301, CASE WHEN ((TR_retnet#94267901 = NA) OR (TR_retnet#94267901 = null)) 
THEN null ELSE cast(TR_retnet#94267901 as float) END AS TR_retnet#94268316, CASE WHEN ((IR_retnet#94267902 = NA) OR (IR_retnet#94267902 = null)) THEN null ELSE cast(IR_retnet#94267902 as float) END AS IR_retnet#94268330, ... 2 more fields] : +- FileScan csv [sector_id#94267879,retIC#94267880,resretIC#94267881,numcos#94267882,numdates#94267883,annual_bmret#94267884,annual_ret#94267885,std_ret#94267886,Sharpe_ret#94267887,PctPos_ret#94267888,TR_ret#94267889,IR_ret#94267890,annual_resret#94267891,std_resret#94267892,Sharpe_resret#94267893,PctPos_resret#94267894,TR_resret#94267895,IR_resret#94267896,annual_retnet#94267897,std_retnet#94267898,Sharpe_retnet#94267899,PctPos_retnet#94267900,TR_retnet#94267901,IR_retnet#94267902,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/digital_revenue_signal..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... 
         +- *(2) Filter isnotnull(sector_id#94160418)
            +- InMemoryTableScan [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)]
                  +- InMemoryRelation [sector_id#94160418, sort#94160419, description#94160423, universe#94160424], StorageLevel(disk, memory, deserialized, 1 replicas)
                        +- *(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
                           +- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>
,None), [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#94268349, numcos#94268096, numdates#94268099, sector_id#94268085]
Arguments: [coverage#94268349, numcos#94268096, numdates#94268099, sector_id#94268085], [isnotnull(sector_id#94268085)]

(4) InMemoryRelation
Arguments: [sector_id#94268085, retIC#94268090, resretIC#94268094, numcos#94268096, numdates#94268099, annual_bmret#94268103, annual_ret#94268107, std_ret#94268111, Sharpe_ret#94268115, PctPos_ret#94268120, TR_ret#94268124, IR_ret#94268126, annual_resret#94268128, std_resret#94268141, Sharpe_resret#94268270, PctPos_resret#94268271, TR_resret#94268284, IR_resret#94268285, annual_retnet#94268286, std_retnet#94268298, Sharpe_retnet#94268300, PctPos_retnet#94268301, TR_retnet#94268316, IR_retnet#94268330,
... 2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94267879 = NA) OR (sector_id#94267879 = null)) THEN null ELSE cast(sector_id#94267879 as int) END AS sector_id#94268085, CASE WHEN ((retIC#94267880 = NA) OR (retIC#94267880 = null)) THEN null ELSE cast(retIC#94267880 as float) END AS retIC#94268090, CASE WHEN ((resretIC#94267881 = NA) OR (resretIC#94267881 = null)) THEN null ELSE cast(resretIC#94267881 as float) END AS resretIC#94268094, CASE WHEN ((numcos#94267882 = NA) OR (numcos#94267882 = null)) THEN null ELSE cast(numcos#94267882 as float) END AS numcos#94268096, CASE WHEN ((numdates#94267883 = NA) OR (numdates#94267883 = null)) THEN null ELSE cast(numdates#94267883 as float) END AS numdates#94268099, CASE WHEN ((annual_bmret#94267884 = NA) OR (annual_bmret#94267884 = null)) THEN null ELSE cast(annual_bmret#94267884 as float) END AS annual_bmret#94268103, CASE WHEN ((annual_ret#94267885 = NA) OR (annual_ret#94267885 = null)) THEN null ELSE cast(annual_ret#94267885 as float) END AS annual_ret#94268107, CASE WHEN ((std_ret#94267886 = NA) OR (std_ret#94267886 = null)) THEN null ELSE cast(std_ret#94267886 as float) END AS std_ret#94268111, CASE WHEN ((Sharpe_ret#94267887 = NA) OR (Sharpe_ret#94267887 = null)) THEN null ELSE cast(Sharpe_ret#94267887 as float) END AS Sharpe_ret#94268115, CASE WHEN ((PctPos_ret#94267888 = NA) OR (PctPos_ret#94267888 = null)) THEN null ELSE cast(PctPos_ret#94267888 as float) END AS PctPos_ret#94268120, CASE WHEN ((TR_ret#94267889 = NA) OR (TR_ret#94267889 = null)) THEN null ELSE cast(TR_ret#94267889 as float) END AS TR_ret#94268124, CASE WHEN ((IR_ret#94267890 = NA) OR (IR_ret#94267890 = null)) THEN null ELSE cast(IR_ret#94267890 as float) END AS IR_ret#94268126, CASE WHEN ((annual_resret#94267891 = NA) OR (annual_resret#94267891 = null)) THEN null ELSE cast(annual_resret#94267891 as 
float) END AS annual_resret#94268128, CASE WHEN ((std_resret#94267892 = NA) OR (std_resret#94267892 = null)) THEN null ELSE cast(std_resret#94267892 as float) END AS std_resret#94268141, CASE WHEN ((Sharpe_resret#94267893 = NA) OR (Sharpe_resret#94267893 = null)) THEN null ELSE cast(Sharpe_resret#94267893 as float) END AS Sharpe_resret#94268270, CASE WHEN ((PctPos_resret#94267894 = NA) OR (PctPos_resret#94267894 = null)) THEN null ELSE cast(PctPos_resret#94267894 as float) END AS PctPos_resret#94268271, CASE WHEN ((TR_resret#94267895 = NA) OR (TR_resret#94267895 = null)) THEN null ELSE cast(TR_resret#94267895 as float) END AS TR_resret#94268284, CASE WHEN ((IR_resret#94267896 = NA) OR (IR_resret#94267896 = null)) THEN null ELSE cast(IR_resret#94267896 as float) END AS IR_resret#94268285, CASE WHEN ((annual_retnet#94267897 = NA) OR (annual_retnet#94267897 = null)) THEN null ELSE cast(annual_retnet#94267897 as float) END AS annual_retnet#94268286, CASE WHEN ((std_retnet#94267898 = NA) OR (std_retnet#94267898 = null)) THEN null ELSE cast(std_retnet#94267898 as float) END AS std_retnet#94268298, CASE WHEN ((Sharpe_retnet#94267899 = NA) OR (Sharpe_retnet#94267899 = null)) THEN null ELSE cast(Sharpe_retnet#94267899 as float) END AS Sharpe_retnet#94268300, CASE WHEN ((PctPos_retnet#94267900 = NA) OR (PctPos_retnet#94267900 = null)) THEN null ELSE cast(PctPos_retnet#94267900 as float) END AS PctPos_retnet#94268301, CASE WHEN ((TR_retnet#94267901 = NA) OR (TR_retnet#94267901 = null)) THEN null ELSE cast(TR_retnet#94267901 as float) END AS TR_retnet#94268316, CASE WHEN ((IR_retnet#94267902 = NA) OR (IR_retnet#94267902 = null)) THEN null ELSE cast(IR_retnet#94267902 as float) END AS IR_retnet#94268330, ... 
2 more fields]
+- FileScan csv [sector_id#94267879,retIC#94267880,resretIC#94267881,numcos#94267882,numdates#94267883,annual_bmret#94267884,annual_ret#94267885,std_ret#94267886,Sharpe_ret#94267887,PctPos_ret#94267888,TR_ret#94267889,IR_ret#94267890,annual_resret#94267891,std_resret#94267892,Sharpe_resret#94267893,PctPos_resret#94267894,TR_resret#94267895,IR_resret#94267896,annual_retnet#94267897,std_retnet#94267898,Sharpe_retnet#94267899,PctPos_retnet#94267900,TR_retnet#94267901,IR_retnet#94267902,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/digital_revenue_signal..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s...
,None)

(5) Scan csv
Output [26]: [sector_id#94267879, retIC#94267880, resretIC#94267881, numcos#94267882, numdates#94267883, annual_bmret#94267884, annual_ret#94267885, std_ret#94267886, Sharpe_ret#94267887, PctPos_ret#94267888, TR_ret#94267889, IR_ret#94267890, annual_resret#94267891, std_resret#94267892, Sharpe_resret#94267893, PctPos_resret#94267894, TR_resret#94267895, IR_resret#94267896, annual_retnet#94267897, std_retnet#94267898, Sharpe_retnet#94267899, PctPos_retnet#94267900, TR_retnet#94267901, IR_retnet#94267902, turnover#94267903, coverage#94267904]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/digital_revenue_signal/rev_signal_percentile_r_100/stats_sector_id.csv]
ReadSchema:
struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#94267879 = NA) OR (sector_id#94267879 = null)) THEN null ELSE cast(sector_id#94267879 as int) END AS sector_id#94268085, CASE WHEN ((retIC#94267880 = NA) OR (retIC#94267880 = null)) THEN null ELSE cast(retIC#94267880 as float) END AS retIC#94268090, CASE WHEN ((resretIC#94267881 = NA) OR (resretIC#94267881 = null)) THEN null ELSE cast(resretIC#94267881 as float) END AS resretIC#94268094, CASE WHEN ((numcos#94267882 = NA) OR (numcos#94267882 = null)) THEN null ELSE cast(numcos#94267882 as float) END AS numcos#94268096, CASE WHEN ((numdates#94267883 = NA) OR (numdates#94267883 = null)) THEN null ELSE cast(numdates#94267883 as float) END AS numdates#94268099, CASE WHEN ((annual_bmret#94267884 = NA) OR (annual_bmret#94267884 = null)) THEN null ELSE cast(annual_bmret#94267884 as float) END AS annual_bmret#94268103, CASE WHEN ((annual_ret#94267885 = NA) OR (annual_ret#94267885 = null)) THEN null ELSE cast(annual_ret#94267885 as float) END AS annual_ret#94268107, CASE WHEN ((std_ret#94267886 = NA) OR (std_ret#94267886 = null)) THEN null ELSE cast(std_ret#94267886 as float) END AS std_ret#94268111, CASE WHEN ((Sharpe_ret#94267887 = NA) OR (Sharpe_ret#94267887 = null)) THEN null ELSE cast(Sharpe_ret#94267887 as float) END AS Sharpe_ret#94268115, CASE WHEN ((PctPos_ret#94267888 = NA) OR (PctPos_ret#94267888 = null)) THEN null ELSE cast(PctPos_ret#94267888 as float) END AS PctPos_ret#94268120, CASE WHEN ((TR_ret#94267889 = NA) OR (TR_ret#94267889 = null))
THEN null ELSE cast(TR_ret#94267889 as float) END AS TR_ret#94268124, CASE WHEN ((IR_ret#94267890 = NA) OR (IR_ret#94267890 = null)) THEN null ELSE cast(IR_ret#94267890 as float) END AS IR_ret#94268126, CASE WHEN ((annual_resret#94267891 = NA) OR (annual_resret#94267891 = null)) THEN null ELSE cast(annual_resret#94267891 as float) END AS annual_resret#94268128, CASE WHEN ((std_resret#94267892 = NA) OR (std_resret#94267892 = null)) THEN null ELSE cast(std_resret#94267892 as float) END AS std_resret#94268141, CASE WHEN ((Sharpe_resret#94267893 = NA) OR (Sharpe_resret#94267893 = null)) THEN null ELSE cast(Sharpe_resret#94267893 as float) END AS Sharpe_resret#94268270, CASE WHEN ((PctPos_resret#94267894 = NA) OR (PctPos_resret#94267894 = null)) THEN null ELSE cast(PctPos_resret#94267894 as float) END AS PctPos_resret#94268271, CASE WHEN ((TR_resret#94267895 = NA) OR (TR_resret#94267895 = null)) THEN null ELSE cast(TR_resret#94267895 as float) END AS TR_resret#94268284, CASE WHEN ((IR_resret#94267896 = NA) OR (IR_resret#94267896 = null)) THEN null ELSE cast(IR_resret#94267896 as float) END AS IR_resret#94268285, CASE WHEN ((annual_retnet#94267897 = NA) OR (annual_retnet#94267897 = null)) THEN null ELSE cast(annual_retnet#94267897 as float) END AS annual_retnet#94268286, CASE WHEN ((std_retnet#94267898 = NA) OR (std_retnet#94267898 = null)) THEN null ELSE cast(std_retnet#94267898 as float) END AS std_retnet#94268298, CASE WHEN ((Sharpe_retnet#94267899 = NA) OR (Sharpe_retnet#94267899 = null)) THEN null ELSE cast(Sharpe_retnet#94267899 as float) END AS Sharpe_retnet#94268300, CASE WHEN ((PctPos_retnet#94267900 = NA) OR (PctPos_retnet#94267900 = null)) THEN null ELSE cast(PctPos_retnet#94267900 as float) END AS PctPos_retnet#94268301, CASE WHEN ((TR_retnet#94267901 = NA) OR (TR_retnet#94267901 = null)) THEN null ELSE cast(TR_retnet#94267901 as float) END AS TR_retnet#94268316, CASE WHEN ((IR_retnet#94267902 = NA) OR (IR_retnet#94267902 = null)) THEN null ELSE 
cast(IR_retnet#94267902 as float) END AS IR_retnet#94268330, CASE WHEN ((turnover#94267903 = NA) OR (turnover#94267903 = null)) THEN null ELSE cast(turnover#94267903 as float) END AS turnover#94268345, CASE WHEN ((coverage#94267904 = NA) OR (coverage#94267904 = null)) THEN null ELSE cast(coverage#94267904 as float) END AS coverage#94268349]
Input [26]: [sector_id#94267879, retIC#94267880, resretIC#94267881, numcos#94267882, numdates#94267883, annual_bmret#94267884, annual_ret#94267885, std_ret#94267886, Sharpe_ret#94267887, PctPos_ret#94267888, TR_ret#94267889, IR_ret#94267890, annual_resret#94267891, std_resret#94267892, Sharpe_resret#94267893, PctPos_resret#94267894, TR_resret#94267895, IR_resret#94267896, annual_retnet#94267897, std_retnet#94267898, Sharpe_retnet#94267899, PctPos_retnet#94267900, TR_retnet#94267901, IR_retnet#94267902, turnover#94267903, coverage#94267904]

(7) ColumnarToRow [codegen id : 1]
Input [4]: [coverage#94268349, numcos#94268096, numdates#94268099, sector_id#94268085]

(8) Filter [codegen id : 1]
Input [4]: [coverage#94268349, numcos#94268096, numdates#94268099, sector_id#94268085]
Condition : isnotnull(sector_id#94268085)

(9) Project [codegen id : 1]
Output [5]: [sector_id#94268085, numcos#94268096, numdates#94268099, coverage#94268349, round((cast(numcos#94268096 as double) / cast(coverage#94268349 as double)), 0) AS universe#94268444]
Input [4]: [coverage#94268349, numcos#94268096, numdates#94268099, sector_id#94268085]

(10) BroadcastExchange
Input [5]: [sector_id#94268085, numcos#94268096, numdates#94268099, coverage#94268349, universe#94268444]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7526496]

(11) InMemoryTableScan
Output [3]: [sector_id#94160418, sort#94160419, description#94160423]
Arguments: [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)]

(12) InMemoryRelation
Arguments: [sector_id#94160418, sort#94160419, description#94160423,
universe#94160424], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
+- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>
,None)

(13) Scan csv
Output [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(14) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
Input [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]

(15) Filter
Input [3]: [sector_id#94160418, sort#94160419,
description#94160423]
Condition : isnotnull(sector_id#94160418)

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#94268085]
Right keys [1]: [sector_id#94160418]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
Input [8]: [sector_id#94268085, numcos#94268096, numdates#94268099, coverage#94268349, universe#94268444, sector_id#94160418, sort#94160419, description#94160423]

(18) Exchange
Input [7]: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
Arguments: rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7526503]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
Arguments: [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#94268085, numcos#94268096, numdates#94268099, sort#94160419, description#94160423, universe#94268444, coverage#94268349]
Arguments: 1000000
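
Note on the `CASE WHEN ((col = NA) OR (col = null)) THEN null ELSE cast(col as ...) END` projections in nodes (6) and (14): the sketch below is plain Python, not Spark code, and only models how such an expression evaluates under SQL three-valued logic. It assumes that `NA` in the plan is the string literal 'NA' (the plan output prints literals unquoted) and that an unparseable string casts to NULL; the helper names `sql_eq`, `sql_or` and `clean_int` are hypothetical and do not appear in the plan. Under these assumptions the `(col = null)` arm never evaluates to TRUE by itself, so NULL inputs reach the ELSE branch and come out as NULL through the cast.

```python
from typing import Optional


def sql_eq(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """SQL '=': the result is NULL (None) when either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_or(p: Optional[bool], q: Optional[bool]) -> Optional[bool]:
    """SQL OR under three-valued logic (TRUE / FALSE / NULL)."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False


def clean_int(raw: Optional[str]) -> Optional[int]:
    """Model of: CASE WHEN (raw = 'NA') OR (raw = null) THEN null
    ELSE cast(raw as int) END."""
    cond = sql_or(sql_eq(raw, "NA"), sql_eq(raw, None))
    if cond is True:  # a CASE branch is taken only when its condition is TRUE
        return None
    if raw is None:   # casting NULL yields NULL
        return None
    try:
        return int(raw)
    except ValueError:
        # Assumption: a cast that cannot parse the string yields NULL.
        return None


if __name__ == "__main__":
    for value in ("NA", "7", None):
        print(repr(value), "->", clean_int(value))
```

In this model the `= null` comparison contributes only NULL to the OR, so the outcome for every input is decided by the `= NA` arm and by the cast.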