== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (19)
               +- Exchange (18)
                  +- * Project (17)
                     +- * BroadcastHashJoin Inner BuildRight (16)
                        :- * Project (9)
                        :  +- * Filter (8)
                        :     +- * ColumnarToRow (7)
                        :        +- InMemoryTableScan (3)
                        :              +- InMemoryRelation (4)
                        :                    +- * Project (6)
                        :                       +- Scan csv (5)
                        +- BroadcastExchange (15)
                           +- * Filter (14)
                              +- InMemoryTableScan (10)
                                    +- InMemoryRelation (11)
                                          +- * Project (13)
                                             +- Scan csv (12)


(1) InMemoryTableScan
Output [7]: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]
Arguments: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]

(2) InMemoryRelation
Arguments: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0 +- Exchange rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7528808] +- *(2) Project [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727] +- *(2) BroadcastHashJoin [sector_id#94296502], [sector_id#94160418], Inner, BuildRight, false :- *(2) Project [sector_id#94296502, numcos#94296517, numdates#94296518, coverage#94296727, round((cast(numcos#94296517 as double) / cast(coverage#94296727 as double)), 0) AS universe#94296863] : +- *(2) Filter isnotnull(sector_id#94296502) : +- *(2) ColumnarToRow : +- InMemoryTableScan [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502], [isnotnull(sector_id#94296502)] : +- InMemoryRelation [sector_id#94296502, 
retIC#94296503, resretIC#94296516, numcos#94296517, numdates#94296518, annual_bmret#94296519, annual_ret#94296520, std_ret#94296522, Sharpe_ret#94296524, PctPos_ret#94296526, TR_ret#94296528, IR_ret#94296530, annual_resret#94296533, std_resret#94296561, Sharpe_resret#94296589, PctPos_resret#94296646, TR_resret#94296648, IR_resret#94296650, annual_retnet#94296652, std_retnet#94296654, Sharpe_retnet#94296668, PctPos_retnet#94296683, TR_retnet#94296696, IR_retnet#94296699, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#94296264 = NA) OR (sector_id#94296264 = null)) THEN null ELSE cast(sector_id#94296264 as int) END AS sector_id#94296502, CASE WHEN ((retIC#94296265 = NA) OR (retIC#94296265 = null)) THEN null ELSE cast(retIC#94296265 as float) END AS retIC#94296503, CASE WHEN ((resretIC#94296266 = NA) OR (resretIC#94296266 = null)) THEN null ELSE cast(resretIC#94296266 as float) END AS resretIC#94296516, CASE WHEN ((numcos#94296267 = NA) OR (numcos#94296267 = null)) THEN null ELSE cast(numcos#94296267 as float) END AS numcos#94296517, CASE WHEN ((numdates#94296268 = NA) OR (numdates#94296268 = null)) THEN null ELSE cast(numdates#94296268 as int) END AS numdates#94296518, CASE WHEN ((annual_bmret#94296269 = NA) OR (annual_bmret#94296269 = null)) THEN null ELSE cast(annual_bmret#94296269 as float) END AS annual_bmret#94296519, CASE WHEN ((annual_ret#94296270 = NA) OR (annual_ret#94296270 = null)) THEN null ELSE cast(annual_ret#94296270 as float) END AS annual_ret#94296520, CASE WHEN ((std_ret#94296271 = NA) OR (std_ret#94296271 = null)) THEN null ELSE cast(std_ret#94296271 as float) END AS std_ret#94296522, CASE WHEN ((Sharpe_ret#94296272 = NA) OR (Sharpe_ret#94296272 = null)) THEN null ELSE cast(Sharpe_ret#94296272 as float) END AS Sharpe_ret#94296524, CASE WHEN ((PctPos_ret#94296273 = NA) OR (PctPos_ret#94296273 = null)) THEN null ELSE cast(PctPos_ret#94296273 as float) END AS PctPos_ret#94296526, CASE 
WHEN ((TR_ret#94296274 = NA) OR (TR_ret#94296274 = null)) THEN null ELSE cast(TR_ret#94296274 as float) END AS TR_ret#94296528, CASE WHEN ((IR_ret#94296275 = NA) OR (IR_ret#94296275 = null)) THEN null ELSE cast(IR_ret#94296275 as float) END AS IR_ret#94296530, CASE WHEN ((annual_resret#94296276 = NA) OR (annual_resret#94296276 = null)) THEN null ELSE cast(annual_resret#94296276 as float) END AS annual_resret#94296533, CASE WHEN ((std_resret#94296277 = NA) OR (std_resret#94296277 = null)) THEN null ELSE cast(std_resret#94296277 as float) END AS std_resret#94296561, CASE WHEN ((Sharpe_resret#94296278 = NA) OR (Sharpe_resret#94296278 = null)) THEN null ELSE cast(Sharpe_resret#94296278 as float) END AS Sharpe_resret#94296589, CASE WHEN ((PctPos_resret#94296279 = NA) OR (PctPos_resret#94296279 = null)) THEN null ELSE cast(PctPos_resret#94296279 as float) END AS PctPos_resret#94296646, CASE WHEN ((TR_resret#94296280 = NA) OR (TR_resret#94296280 = null)) THEN null ELSE cast(TR_resret#94296280 as float) END AS TR_resret#94296648, CASE WHEN ((IR_resret#94296281 = NA) OR (IR_resret#94296281 = null)) THEN null ELSE cast(IR_resret#94296281 as float) END AS IR_resret#94296650, CASE WHEN ((annual_retnet#94296282 = NA) OR (annual_retnet#94296282 = null)) THEN null ELSE cast(annual_retnet#94296282 as float) END AS annual_retnet#94296652, CASE WHEN ((std_retnet#94296283 = NA) OR (std_retnet#94296283 = null)) THEN null ELSE cast(std_retnet#94296283 as float) END AS std_retnet#94296654, CASE WHEN ((Sharpe_retnet#94296284 = NA) OR (Sharpe_retnet#94296284 = null)) THEN null ELSE cast(Sharpe_retnet#94296284 as float) END AS Sharpe_retnet#94296668, CASE WHEN ((PctPos_retnet#94296285 = NA) OR (PctPos_retnet#94296285 = null)) THEN null ELSE cast(PctPos_retnet#94296285 as float) END AS PctPos_retnet#94296683, CASE WHEN ((TR_retnet#94296286 = NA) OR (TR_retnet#94296286 = null)) THEN null ELSE cast(TR_retnet#94296286 as float) END AS TR_retnet#94296696, CASE WHEN ((IR_retnet#94296287 = NA) OR 
(IR_retnet#94296287 = null)) THEN null ELSE cast(IR_retnet#94296287 as float) END AS IR_retnet#94296699, ... 2 more fields] : +- FileScan csv [sector_id#94296264,retIC#94296265,resretIC#94296266,numcos#94296267,numdates#94296268,annual_bmret#94296269,annual_ret#94296270,std_ret#94296271,Sharpe_ret#94296272,PctPos_ret#94296273,TR_ret#94296274,IR_ret#94296275,annual_resret#94296276,std_resret#94296277,Sharpe_resret#94296278,PctPos_resret#94296279,TR_resret#94296280,IR_resret#94296281,annual_retnet#94296282,std_retnet#94296283,Sharpe_retnet#94296284,PctPos_retnet#94296285,TR_retnet#94296286,IR_retnet#94296287,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/esg_innovation/innovat..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7528803] +- *(1) Filter isnotnull(sector_id#94160418) +- InMemoryTableScan [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)] +- InMemoryRelation [sector_id#94160418, sort#94160419, description#94160423, universe#94160424], StorageLevel(disk, memory, deserialized, 1 replicas) +- *(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424] +- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, 
Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None), [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502]
Arguments: [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502], [isnotnull(sector_id#94296502)]

(4) InMemoryRelation
Arguments: [sector_id#94296502, retIC#94296503, resretIC#94296516, numcos#94296517, numdates#94296518, annual_bmret#94296519, annual_ret#94296520, std_ret#94296522, Sharpe_ret#94296524, PctPos_ret#94296526, TR_ret#94296528, IR_ret#94296530, annual_resret#94296533, std_resret#94296561, Sharpe_resret#94296589, PctPos_resret#94296646, TR_resret#94296648, IR_resret#94296650, annual_retnet#94296652, std_retnet#94296654, Sharpe_retnet#94296668, PctPos_retnet#94296683, TR_retnet#94296696, IR_retnet#94296699, ... 
2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94296264 = NA) OR (sector_id#94296264 = null)) THEN null ELSE cast(sector_id#94296264 as int) END AS sector_id#94296502, CASE WHEN ((retIC#94296265 = NA) OR (retIC#94296265 = null)) THEN null ELSE cast(retIC#94296265 as float) END AS retIC#94296503, CASE WHEN ((resretIC#94296266 = NA) OR (resretIC#94296266 = null)) THEN null ELSE cast(resretIC#94296266 as float) END AS resretIC#94296516, CASE WHEN ((numcos#94296267 = NA) OR (numcos#94296267 = null)) THEN null ELSE cast(numcos#94296267 as float) END AS numcos#94296517, CASE WHEN ((numdates#94296268 = NA) OR (numdates#94296268 = null)) THEN null ELSE cast(numdates#94296268 as int) END AS numdates#94296518, CASE WHEN ((annual_bmret#94296269 = NA) OR (annual_bmret#94296269 = null)) THEN null ELSE cast(annual_bmret#94296269 as float) END AS annual_bmret#94296519, CASE WHEN ((annual_ret#94296270 = NA) OR (annual_ret#94296270 = null)) THEN null ELSE cast(annual_ret#94296270 as float) END AS annual_ret#94296520, CASE WHEN ((std_ret#94296271 = NA) OR (std_ret#94296271 = null)) THEN null ELSE cast(std_ret#94296271 as float) END AS std_ret#94296522, CASE WHEN ((Sharpe_ret#94296272 = NA) OR (Sharpe_ret#94296272 = null)) THEN null ELSE cast(Sharpe_ret#94296272 as float) END AS Sharpe_ret#94296524, CASE WHEN ((PctPos_ret#94296273 = NA) OR (PctPos_ret#94296273 = null)) THEN null ELSE cast(PctPos_ret#94296273 as float) END AS PctPos_ret#94296526, CASE WHEN ((TR_ret#94296274 = NA) OR (TR_ret#94296274 = null)) THEN null ELSE cast(TR_ret#94296274 as float) END AS TR_ret#94296528, CASE WHEN ((IR_ret#94296275 = NA) OR (IR_ret#94296275 = null)) THEN null ELSE cast(IR_ret#94296275 as float) END AS IR_ret#94296530, CASE WHEN ((annual_resret#94296276 = NA) OR (annual_resret#94296276 = null)) THEN null ELSE cast(annual_resret#94296276 as float) 
END AS annual_resret#94296533, CASE WHEN ((std_resret#94296277 = NA) OR (std_resret#94296277 = null)) THEN null ELSE cast(std_resret#94296277 as float) END AS std_resret#94296561, CASE WHEN ((Sharpe_resret#94296278 = NA) OR (Sharpe_resret#94296278 = null)) THEN null ELSE cast(Sharpe_resret#94296278 as float) END AS Sharpe_resret#94296589, CASE WHEN ((PctPos_resret#94296279 = NA) OR (PctPos_resret#94296279 = null)) THEN null ELSE cast(PctPos_resret#94296279 as float) END AS PctPos_resret#94296646, CASE WHEN ((TR_resret#94296280 = NA) OR (TR_resret#94296280 = null)) THEN null ELSE cast(TR_resret#94296280 as float) END AS TR_resret#94296648, CASE WHEN ((IR_resret#94296281 = NA) OR (IR_resret#94296281 = null)) THEN null ELSE cast(IR_resret#94296281 as float) END AS IR_resret#94296650, CASE WHEN ((annual_retnet#94296282 = NA) OR (annual_retnet#94296282 = null)) THEN null ELSE cast(annual_retnet#94296282 as float) END AS annual_retnet#94296652, CASE WHEN ((std_retnet#94296283 = NA) OR (std_retnet#94296283 = null)) THEN null ELSE cast(std_retnet#94296283 as float) END AS std_retnet#94296654, CASE WHEN ((Sharpe_retnet#94296284 = NA) OR (Sharpe_retnet#94296284 = null)) THEN null ELSE cast(Sharpe_retnet#94296284 as float) END AS Sharpe_retnet#94296668, CASE WHEN ((PctPos_retnet#94296285 = NA) OR (PctPos_retnet#94296285 = null)) THEN null ELSE cast(PctPos_retnet#94296285 as float) END AS PctPos_retnet#94296683, CASE WHEN ((TR_retnet#94296286 = NA) OR (TR_retnet#94296286 = null)) THEN null ELSE cast(TR_retnet#94296286 as float) END AS TR_retnet#94296696, CASE WHEN ((IR_retnet#94296287 = NA) OR (IR_retnet#94296287 = null)) THEN null ELSE cast(IR_retnet#94296287 as float) END AS IR_retnet#94296699, ... 
2 more fields] +- FileScan csv [sector_id#94296264,retIC#94296265,resretIC#94296266,numcos#94296267,numdates#94296268,annual_bmret#94296269,annual_ret#94296270,std_ret#94296271,Sharpe_ret#94296272,PctPos_ret#94296273,TR_ret#94296274,IR_ret#94296275,annual_resret#94296276,std_resret#94296277,Sharpe_resret#94296278,PctPos_resret#94296279,TR_resret#94296280,IR_resret#94296281,annual_retnet#94296282,std_retnet#94296283,Sharpe_retnet#94296284,PctPos_retnet#94296285,TR_retnet#94296286,IR_retnet#94296287,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/esg_innovation/innovat..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... ,None)

(5) Scan csv
Output [26]: [sector_id#94296264, retIC#94296265, resretIC#94296266, numcos#94296267, numdates#94296268, annual_bmret#94296269, annual_ret#94296270, std_ret#94296271, Sharpe_ret#94296272, PctPos_ret#94296273, TR_ret#94296274, IR_ret#94296275, annual_resret#94296276, std_resret#94296277, Sharpe_resret#94296278, PctPos_resret#94296279, TR_resret#94296280, IR_resret#94296281, annual_retnet#94296282, std_retnet#94296283, Sharpe_retnet#94296284, PctPos_retnet#94296285, TR_retnet#94296286, IR_retnet#94296287, turnover#94296288, coverage#94296289]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/esg_innovation/innovation/stats_sector_id.csv]
ReadSchema: 
struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#94296264 = NA) OR (sector_id#94296264 = null)) THEN null ELSE cast(sector_id#94296264 as int) END AS sector_id#94296502, CASE WHEN ((retIC#94296265 = NA) OR (retIC#94296265 = null)) THEN null ELSE cast(retIC#94296265 as float) END AS retIC#94296503, CASE WHEN ((resretIC#94296266 = NA) OR (resretIC#94296266 = null)) THEN null ELSE cast(resretIC#94296266 as float) END AS resretIC#94296516, CASE WHEN ((numcos#94296267 = NA) OR (numcos#94296267 = null)) THEN null ELSE cast(numcos#94296267 as float) END AS numcos#94296517, CASE WHEN ((numdates#94296268 = NA) OR (numdates#94296268 = null)) THEN null ELSE cast(numdates#94296268 as int) END AS numdates#94296518, CASE WHEN ((annual_bmret#94296269 = NA) OR (annual_bmret#94296269 = null)) THEN null ELSE cast(annual_bmret#94296269 as float) END AS annual_bmret#94296519, CASE WHEN ((annual_ret#94296270 = NA) OR (annual_ret#94296270 = null)) THEN null ELSE cast(annual_ret#94296270 as float) END AS annual_ret#94296520, CASE WHEN ((std_ret#94296271 = NA) OR (std_ret#94296271 = null)) THEN null ELSE cast(std_ret#94296271 as float) END AS std_ret#94296522, CASE WHEN ((Sharpe_ret#94296272 = NA) OR (Sharpe_ret#94296272 = null)) THEN null ELSE cast(Sharpe_ret#94296272 as float) END AS Sharpe_ret#94296524, CASE WHEN ((PctPos_ret#94296273 = NA) OR (PctPos_ret#94296273 = null)) THEN null ELSE cast(PctPos_ret#94296273 as float) END AS PctPos_ret#94296526, CASE WHEN ((TR_ret#94296274 = NA) OR (TR_ret#94296274 = null)) THEN 
null ELSE cast(TR_ret#94296274 as float) END AS TR_ret#94296528, CASE WHEN ((IR_ret#94296275 = NA) OR (IR_ret#94296275 = null)) THEN null ELSE cast(IR_ret#94296275 as float) END AS IR_ret#94296530, CASE WHEN ((annual_resret#94296276 = NA) OR (annual_resret#94296276 = null)) THEN null ELSE cast(annual_resret#94296276 as float) END AS annual_resret#94296533, CASE WHEN ((std_resret#94296277 = NA) OR (std_resret#94296277 = null)) THEN null ELSE cast(std_resret#94296277 as float) END AS std_resret#94296561, CASE WHEN ((Sharpe_resret#94296278 = NA) OR (Sharpe_resret#94296278 = null)) THEN null ELSE cast(Sharpe_resret#94296278 as float) END AS Sharpe_resret#94296589, CASE WHEN ((PctPos_resret#94296279 = NA) OR (PctPos_resret#94296279 = null)) THEN null ELSE cast(PctPos_resret#94296279 as float) END AS PctPos_resret#94296646, CASE WHEN ((TR_resret#94296280 = NA) OR (TR_resret#94296280 = null)) THEN null ELSE cast(TR_resret#94296280 as float) END AS TR_resret#94296648, CASE WHEN ((IR_resret#94296281 = NA) OR (IR_resret#94296281 = null)) THEN null ELSE cast(IR_resret#94296281 as float) END AS IR_resret#94296650, CASE WHEN ((annual_retnet#94296282 = NA) OR (annual_retnet#94296282 = null)) THEN null ELSE cast(annual_retnet#94296282 as float) END AS annual_retnet#94296652, CASE WHEN ((std_retnet#94296283 = NA) OR (std_retnet#94296283 = null)) THEN null ELSE cast(std_retnet#94296283 as float) END AS std_retnet#94296654, CASE WHEN ((Sharpe_retnet#94296284 = NA) OR (Sharpe_retnet#94296284 = null)) THEN null ELSE cast(Sharpe_retnet#94296284 as float) END AS Sharpe_retnet#94296668, CASE WHEN ((PctPos_retnet#94296285 = NA) OR (PctPos_retnet#94296285 = null)) THEN null ELSE cast(PctPos_retnet#94296285 as float) END AS PctPos_retnet#94296683, CASE WHEN ((TR_retnet#94296286 = NA) OR (TR_retnet#94296286 = null)) THEN null ELSE cast(TR_retnet#94296286 as float) END AS TR_retnet#94296696, CASE WHEN ((IR_retnet#94296287 = NA) OR (IR_retnet#94296287 = null)) THEN null ELSE 
cast(IR_retnet#94296287 as float) END AS IR_retnet#94296699, CASE WHEN ((turnover#94296288 = NA) OR (turnover#94296288 = null)) THEN null ELSE cast(turnover#94296288 as float) END AS turnover#94296713, CASE WHEN ((coverage#94296289 = NA) OR (coverage#94296289 = null)) THEN null ELSE cast(coverage#94296289 as float) END AS coverage#94296727]
Input [26]: [sector_id#94296264, retIC#94296265, resretIC#94296266, numcos#94296267, numdates#94296268, annual_bmret#94296269, annual_ret#94296270, std_ret#94296271, Sharpe_ret#94296272, PctPos_ret#94296273, TR_ret#94296274, IR_ret#94296275, annual_resret#94296276, std_resret#94296277, Sharpe_resret#94296278, PctPos_resret#94296279, TR_resret#94296280, IR_resret#94296281, annual_retnet#94296282, std_retnet#94296283, Sharpe_retnet#94296284, PctPos_retnet#94296285, TR_retnet#94296286, IR_retnet#94296287, turnover#94296288, coverage#94296289]

(7) ColumnarToRow [codegen id : 2]
Input [4]: [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502]

(8) Filter [codegen id : 2]
Input [4]: [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502]
Condition : isnotnull(sector_id#94296502)

(9) Project [codegen id : 2]
Output [5]: [sector_id#94296502, numcos#94296517, numdates#94296518, coverage#94296727, round((cast(numcos#94296517 as double) / cast(coverage#94296727 as double)), 0) AS universe#94296863]
Input [4]: [coverage#94296727, numcos#94296517, numdates#94296518, sector_id#94296502]

(10) InMemoryTableScan
Output [3]: [sector_id#94160418, sort#94160419, description#94160423]
Arguments: [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)]

(11) InMemoryRelation
Arguments: [sector_id#94160418, sort#94160419, description#94160423, universe#94160424], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR 
(sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424] +- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None)

(12) Scan csv
Output [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(13) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
Input [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]

(14) Filter [codegen id : 1]
Input [3]: [sector_id#94160418, sort#94160419, description#94160423]
Condition : isnotnull(sector_id#94160418)

(15) BroadcastExchange
Input [3]: [sector_id#94160418, sort#94160419, description#94160423]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, 
false] as bigint)),false), [id=#7528803]

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#94296502]
Right keys [1]: [sector_id#94160418]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]
Input [8]: [sector_id#94296502, numcos#94296517, numdates#94296518, coverage#94296727, universe#94296863, sector_id#94160418, sort#94160419, description#94160423]

(18) Exchange
Input [7]: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]
Arguments: rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7528808]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]
Arguments: [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#94296502, numcos#94296517, numdates#94296518, sort#94160419, description#94160423, universe#94296863, coverage#94296727]
Arguments: 1000000