Community Portal

Process dataset

Created on Thursday 6 August 2020, 10:35

Discussion and updates

This task was created by the system

Hi Aris,

Could you upload a new file here that contains the information that we usually include? What is missing is:

  • Date

  • Unit

  • Material code

CSV format would be ideal. You can simply edit the record and upload a new file instead. Have a look here for additional references and please see attached a sample CSV.

Hey Paul,
I can do that, but we'll have to think some things through. For instance, in this case we need to include the building IDs (so we can link them to the buildings shapefile). I assume all building info (building age, height, gross floor area, etc.) should go to 3.02? Also, in the case of Melbourne we have embodied energy/carbon/water. How do you think we should classify them?

Hi Aris,

I'm picking this up again... in response to your message:

  • Please find in the library item a new attached file ("processed data") where I have taken all data and transformed it into our format.

  • Indeed, all the additional building info should be in the shapefile itself, and this has in fact already been imported because it was detected at the time of processing. Check e.g. here.

  • For embodied materials, I would say simply list the relevant code, but in the material name type "Embodied energy", "Embodied water", etc.

Now, in addition to these points, please refine the attached file and add the right date to it, add the additional info you mentioned, and see if there's anything else to change.

Lastly, when I did a trial run I got an error because your data file lists a building which is not present in the shapefile. It could also be that it's more than one, but the script stops upon getting an error. The issue is with the building with ID "100114". Can you check on that?

Hi Paul,
I'm heading off on holiday, so I will have to pick this up when we're both back.
However, just so I understand correctly: should I re-upload another SHP or another CSV for the dataset?
Also, I'm not sure what you mean by checking on this building. Should I look for any irregularity for that ID in the fields? Any idea what I should be looking for?

Hi Aris,
Okay sure. You should EITHER upload a new CSV file OR upload a new SHP file. Your goal: the information in the SHP should match the information in the CSV. The information to check is the building ID. All the building IDs that you list data for in the CSV should exist in the SHP. Currently, that is not the case (see building 100114 - present in CSV, not present in SHP). So you should EITHER remove non-existing buildings from the CSV file, or add them into the SHP. Does that make sense?
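
A check like the one described above could be sketched in Python as follows. This is only an illustration, not the script the platform actually runs: the column name `building_id`, the helper `find_missing_ids`, and the sample values are all assumptions, and the set of shapefile IDs is assumed to have been extracted from the SHP beforehand. Unlike the import script, it collects every missing ID instead of stopping at the first one.

```python
import csv
import io


def find_missing_ids(csv_file, shapefile_ids, id_column="building_id"):
    """Return a sorted list of IDs that appear in the CSV but not in the shapefile.

    csv_file      -- an open text file (or file-like object) with a header row
    shapefile_ids -- iterable of building IDs already read from the SHP (assumed)
    id_column     -- hypothetical name of the CSV column holding the building ID
    """
    reader = csv.DictReader(csv_file)
    csv_ids = {row[id_column].strip() for row in reader}
    return sorted(csv_ids - set(shapefile_ids))


# Made-up sample data: one building is listed in the CSV but absent from the SHP.
sample_csv = io.StringIO(
    "building_id,material,quantity,unit\n"
    "100049,Concrete,1200,kg\n"
    "100114,Concrete,800,kg\n"
)
shapefile_ids = {"100049", "100742"}

missing = find_missing_ids(sample_csv, shapefile_ids)
print(missing)
```

Any ID in the resulting list either has to be removed from the CSV or added to the SHP before the data can be processed.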

Ok, I found what is wrong here. The SHP file is missing some of the items in the CSV. The problem is that when we did that study I downloaded a shapefile that I'm not sure still exists today on the open data platform. For this dataset I instead downloaded a file that I thought would be the same as the one we used. So perhaps I can upload the shapefile we actually used, with no URL for where to find it, and also keep the official one with a URL. What do you think?

Great Aris. And yes that sounds all good - please proceed and simply give us a shout when the right shapefile is in place and we can take this forward!

Ok done! Should I start crunching? Also, please keep in mind that within this shp you have all the necessary data to make the viz (meaning that all the necessary fields are there). No need to link it to other excel files. Let me know if I can add anything.

Great, thanks Aris!

Don't "crunch" it manually: the file is too large and your session may time out. Instead let the server do this throughout the day in the background.

In terms of material stock data, remember that we need that in the right format. A shapefile is not the right format for material stock data. However, I already worked on this before and you can read in my messages above that I uploaded a file here which contains all the data I previously formatted based on a file you submitted before. Are these the data we need? If not, can you please add a different file there with the right data?

Once we have both the shapefile and the corresponding material stock data, we can make a move with the material stock stuff!

Ok great. Yes, these are the right ones, although the unit is kg. Should I convert it to tonnes?

Also should I change the EMP codes to the "new" ones (the ones we also added for CityLoops) which best fit construction materials? I will also add a couple of other materials.
Finally, for embodied resources (water, CO2, energy), should I add EMP7.1 and EMP5.H respectively, and what for energy?

  • Conversion to tonnes: convert to whatever the original unit is.
  • Use new EMP codes: yes please
  • Embodied resources: for water/CO2, yes, please use those categories. Energy: in what unit do you have that? Total kJ or something? I guess adding EMP7.4 - Energy would make most sense... what do you think?

  • Not sure what I should do :) Keep the values as they are and add kg as the unit, or convert to tonnes (divide the values by 1000)?

  • Ok noted

  • Ok, noted for water/CO2. Energy is a difficult one, but yes, it might be a good idea to add 7.4. And yes, the unit is joules.

  • If the original data comes in kg, then yes please keep in kg.

Noted with the other two points

Ok done, I've added the final processed dataset. I will process it now in the system!

Task was assigned to Aristide Athanassiadis and status was changed: Open → In Progress

Hey Paul, I'm processing the file but I get this error
(Not all of your data could be properly processed. Please review the error below and upload a new file. ERROR: We could not find the space with the name: '100049')
However, I can see that this reference space is present both in the shp file and the excel I uploaded. Did I mess up something along the way?

Please send me the link to the processed shapefile to review.

Aha you're right, I forgot about this, it is still not processed so it's looking at the wrong ref spaces.
This is the shp file to be processed

Ah OK copy that. I see the shapefile was too large to auto-process and we had to remove that large file block. I think it should now process at the next round, so please check again in 6h and if you see that all 13k items are listed, then you can proceed with the material stock file.

Got it! Fingers crossed 🤞

We still have only 2200 items processed on the shapefile. I don't know whether I should tweak something to get the 13k items there?

I've been monitoring things and there is a huge import happening in the back-end... running for many hours already and still going at 100% on the server, so we gotta stay tuned, this may take a while. Don't make any changes - I'm not sure if it's this file or another that is taking up all the time but let's give it another half day or so to see.

Done Aris! The import is complete (turns out we had three maps of the Barcelona port that were mega huge and were running in the background, I'll look into sorting that out later), and the 13k items are now there! Please try to process the spreadsheet.

Hey Paul, thanks for letting me know. I've tried to process it, but every time I assign the ref space (Embodied ...) and click on save & next, I see some activity (a spinning wheel instead of the favicon in Chrome), then the "MoC is updating" screen, and then I'm back at the same processing screen, this time with none of the reference spaces selected.
Any pointers?

Yeah that likely means this dataset is too large to save in one go. But hopefully it means that everything else is ready to roll. I will take it from here and update you next week with progress. Thanks for getting the data prepared!!

Aris, quick question: what is the material named "Total"? Is this the sum of all the other individual materials that are also listed? If so, please remove it -- we are otherwise getting duplicated data in. If this is something else, please put a descriptive name e.g. "Concrete".

Yes indeed it is the sum of the materials below. Ok let me do that and I'll reupload it.

New version is up

Great, thanks Aris! For the future, keep in mind the drilldown chart that you can see here. The system shows the total, and if you click on it, it shows how that total is broken down. This only works if you do not upload totals yourself. Otherwise the "total" will be amongst the drilldown elements (and thus effectively double the figures). That may be an easy way to remember what to upload.
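
As a rough illustration of the point about totals, the following Python sketch strips pre-computed "Total" rows from a file before upload. The column names (`material_name`, `quantity`), the helper `drop_total_rows`, and the sample figures are assumptions made for this example and are not taken from the actual dataset or from the platform's code.

```python
import csv
import io


def drop_total_rows(rows, name_column="material_name", total_label="Total"):
    """Return only the rows whose material name is not the aggregate label.

    rows        -- list of dicts, e.g. as produced by csv.DictReader
    name_column -- hypothetical name of the column holding the material name
    total_label -- label used for pre-computed sums (compared case-insensitively)
    """
    label = total_label.lower()
    return [row for row in rows if row[name_column].strip().lower() != label]


# Made-up sample: the "Total" row is the sum of the two materials listed with it.
sample_csv = io.StringIO(
    "building_id,material_name,quantity,unit\n"
    "100049,Concrete,1000,kg\n"
    "100049,Steel,500,kg\n"
    "100049,Total,1500,kg\n"
)

rows = list(csv.DictReader(sample_csv))
cleaned = drop_total_rows(rows)

# With the total row left in, summing the column would count every material twice.
with_total = sum(float(row["quantity"]) for row in rows)
without_total = sum(float(row["quantity"]) for row in cleaned)
print(with_total, without_total)
```

In this sample the sum with the "Total" row included is exactly twice the sum without it, which is the doubling described above.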

I'll work on this next week.

Noted Paul 👍. Looking forward to it!

Hey Paul, I don't know if you managed to have a look at this dataset? I don't know why it says:
File processing error
We have tried processing this file, but have encountered an error. We could not find the space with the name: '100742'

I checked and the space exists both in the processed file and the shapefile.

Sorry Aris I haven't been able to sit down for this. When I do, I'll update you here.

Status change: In Progress → Completed