NGD Reflections

Life
Enterprise Architecture
Published

September 3, 2024

Today started with a change of plan. I’d got up expecting to follow my usual routine: breakfast, a walk of 20 mins or more, then settling down with a coffee in front of my laptop to make progress on my sabbatical goals, with my enthusiasm expected to peter out around lunchtime and the afternoon spent on odd jobs. However, I had forgotten that Charlotte has a half day on Wednesdays, so it made sense to invert my day and spend the morning pootling with her.

I extended my walk to ~45 mins, going through Kingsgrove, over the main road and across the fields to the path between Ardington and Wantage. I then headed back to Wantage along that path. A very satisfying start to the day, and excellent thinking time. I had an idea of framing a discussion on the role of Enterprise Architecture in terms of my experience back in the 2000s on the NERC Data Grid project.

Let’s get this out the door.

Back to finishing writing down why I am taking a sabbatical. I have some notes from when I made the application, but they are a little spun for corporate consumption. I’m starting a brain dump, but it will probably be a day or two before it is ready for publishing.

Object serialisation in Ray

I read some more about how Ray serialises code and data within a cluster. I was pleased to see Ray has based its object store on Arrow’s Plasma Object Store. I had wondered whether anything had happened to Plasma but it appears to be well used (now forked by Ray).

The Ray serialisation docs describe how Ray uses pickle protocol version 5 to improve serialisation of large objects and how it uses cloudpickle to widen the range of objects it can serialise. Cloudpickle is able to pickle functions and classes, which would explain how Ray can serialise tasks and actors in the driver script for transmission to worker nodes. Reading between the lines in the cloudpickle README, I assume that tasks and actors defined in the driver script are detected as such and serialised by value, whereas references to other code objects are serialised by reference, thus assuming the workers have the same Python libraries available.
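A minimal sketch of the by-reference versus by-value distinction, using plain pickle and cloudpickle rather than Ray itself. It assumes cloudpickle is installed (the sketch skips that part if it is not). The names add_one and stdlib_can_pickle are made up for illustration, with the lambda standing in for a function defined in a driver script.

```python
import json
import pickle

try:
    import cloudpickle  # third-party; assumed to be installed
except ImportError:
    cloudpickle = None


def stdlib_can_pickle(obj) -> bool:
    """Return True if the standard pickler accepts obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# An importable function is pickled by reference: the payload holds only
# the module name and function name, so the receiver must be able to
# import json itself.
by_reference = pickle.dumps(json.dumps)

# Hypothetical stand-in for a function defined in a driver script.
# The standard pickler cannot look a lambda up by name, so it refuses it.
add_one = lambda x: x + 1

if cloudpickle is not None:
    # cloudpickle falls back to pickling by value: the code object itself
    # travels in the payload, so the receiver needs no matching definition.
    by_value = cloudpickle.dumps(add_one)
    restored = pickle.loads(by_value)
```

The by-reference payload is tiny but only works where the same library is importable, which matches the assumption above that workers need the same Python libraries as the driver.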

Other odds and ends

I started looking into the differences between S3, GCS and Azure Blob Storage:

  • https://www.linkedin.com/pulse/four-critical-differences-between-google-cloud-storage-giorgio-regni