What challenges or lessons have you encountered when managing cloud costs within the context of long-term contracts, and how have you addressed them?
Another crucial point is data retention. It's easy to get surprised, especially with the increasing use of large language models and companies building their own bots. There can be terabytes of data that now need to move in and out, incurring ingress and egress charges. Organizations need to be prepared for this and take a long-term view on how they'll handle it.
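To see how quickly egress charges add up at that scale, here is a back-of-the-envelope sketch. The per-GB rate is a hypothetical placeholder, not any provider's actual price; check your own contract's tiers and committed-use discounts.

```python
# Back-of-the-envelope egress cost estimate for moving LLM-related data.
# EGRESS_RATE_PER_GB is a hypothetical placeholder, not a real quote.
EGRESS_RATE_PER_GB = 0.09  # assumed on-demand egress price, USD per GB

def monthly_egress_cost(terabytes_moved: float,
                        rate_per_gb: float = EGRESS_RATE_PER_GB) -> float:
    """Estimate monthly egress cost for a given volume in TB."""
    return terabytes_moved * 1024 * rate_per_gb

# Moving 5 TB out per month at the assumed rate:
print(f"${monthly_egress_cost(5):,.2f}")  # 5 * 1024 * 0.09 = $460.80
```

At even modest volumes the number is visible on an invoice, which is why the long-term view matters before the data pipeline is built.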
And consider where the data is going and how long you're storing it, not just in the cloud but also in data centers behind the scenes that might be transferring information to places you're not aware of. Being aware of these aspects and understanding your regulatory requirements is very important.
You read my mind: that was exactly what I was going to bring up. Long-term storage and archival retention policies are significant considerations, and not all storage is cut from the same cloth. With all of these services you have to ask where you're putting your data and whether it actually needs to be in that environment. It's worth exploring whether other contractors or data center providers could partner with you to reduce costs. It might be convenient to have all your data in one place, but if data hasn't changed in three or more years, question whether it needs to stay there. There may be an opportunity to move it elsewhere and save some money without disrupting too much of the workflow.
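The tiering decision above can be sanity-checked with simple arithmetic. The per-GB-month prices below are hypothetical placeholders, and real archive tiers also carry retrieval fees and minimum-duration charges not modeled here.

```python
# Rough monthly storage cost comparison across tiers for cold data.
# All prices are hypothetical placeholders per GB-month.
TIERS = {
    "hot":     0.023,   # assumed standard object-storage rate
    "cool":    0.010,   # assumed infrequent-access rate
    "archive": 0.002,   # assumed deep-archive rate
}

def monthly_cost(gigabytes: float, tier: str) -> float:
    """Monthly storage cost for a volume in GB at the given tier."""
    return gigabytes * TIERS[tier]

data_gb = 50_000  # e.g. 50 TB untouched for three-plus years
for tier in TIERS:
    print(f"{tier:>7}: ${monthly_cost(data_gb, tier):,.2f}/month")
```

Even with retrieval fees added back in, data that genuinely never changes usually justifies the colder tier; the point is to run the numbers rather than default to convenience.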
Yeah, great points. One thing I would add: as an organization, and especially as a leader who owns these cloud costs, you want to benchmark your costs against the industry and against your competitors to see whether your spend is actually in line with the norm.
For example, Rasheen mentioned large language models and data handling, and conventional wisdom might suggest using the right stacks and APIs across various hyperscalers: TensorFlow on Google Cloud for some tasks, other workloads on Azure because it's cheaper there. But then you have the connectivity costs between them to consider, and those expenses add up.
You need to find out whether that spend is actually normal for a company in your industry vertical or whether you're overspending compared to your competitors. That visibility helps you identify where to adjust development practices, behaviors, or approaches, all of which contribute to a culture of cost control. Remember that not all hyperscalers are created equal, so it's crucial to be pragmatic and keep your approaches in check.
Also, you really do need some sort of tool or template for forecasting as you move from physical server sizing to cloud-based computing, whether platform or serverless, so that from a sizing standpoint you're provisioning the capacity you need for the environments you're standing up.
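A forecasting template doesn't need to be elaborate to be useful. This is a minimal sketch that compounds a baseline monthly spend forward so you can compare the projection against a contract commitment; the baseline and growth rate are illustrative inputs, not benchmarks.

```python
# Minimal spend-forecast sketch: project monthly cloud spend from a
# baseline with an assumed compound growth rate. Inputs are illustrative.
def forecast_spend(baseline_monthly: float,
                   monthly_growth: float,
                   months: int) -> list[float]:
    """Compound the baseline forward, returning one figure per month."""
    spend, projection = baseline_monthly, []
    for _ in range(months):
        projection.append(round(spend, 2))
        spend *= 1 + monthly_growth
    return projection

# A $20k/month baseline growing 4% per month over a 12-month term:
projection = forecast_spend(20_000, 0.04, 12)
print(f"Month 12: ${projection[-1]:,.2f}")
print(f"Term total: ${sum(projection):,.2f}")
```

Running a few growth scenarios against your committed spend is a quick way to spot, early in the term, whether the sizing assumptions in the contract still hold.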