Winter

Making an infinite world with Falling Sand part 2

Welcome back to the sand series, this post directly follows from what we did last time, so if you missed that, here's the link. In this post we'll first split the world into chunks, then look at some ways to speed it up.

Splitting the world into chunks #

The main feature that chunks allow for is expanding the world quickly. Let's look at the two extreme solutions, and then why chunks are the obvious way to go.

If we store the world in a single list, whenever the player reaches the edge, the whole list needs to be reallocated. This would be fine for smaller worlds, but with bigger worlds becomes impractical. Another approach would be to store each cell in a map. This would allow for an expanding world but is crippled by very slow iteration caused by the memory being scattered all over the place.

The solution comes from mixing the two together. If we store small lists in a map, we can have fast iteration, while also being able to expand the world without copying anything. These lists will act as our chunks, and if each one is a fixed size, some simple math can find the corresponding one from a world coordinate. Whenever the player wants to expand the world, we just add more chunks into the map. This only takes minimal work to implement. Each chunk is just like a small world, all we need to account for is that a particle could move from one chunk to another.

We ended last time with a SandWorld class that contained some simple functions for getting, setting, and moving cells. Let's copy those, along with the member variables to a new class called SandChunk.

There are only few changes we need to make. First, we'll add member variables for the chunk's world position, and edit GetIndex and InBounds to account for these. We'll also need to change all the functions that take x and y from unsigned to signed ints, as world coordinates contain negative numbers. Originally, I had tried to convert from world coordinates to chunk coordinates in the world, but that was a headache, this way there is only one coordinate system all the way down.

SandChunk.h

class SandChunk {
public:
	const int m_width, m_height;
	const int m_x, m_y;
private:
	Cell* m_cells;
	std::vector<std::pair<size_t, size_t>> m_changes; // destination, source
 
public:
	SandChunk(size_t width, size_t height, int x, int y)
		: m_width  (width)
		, m_height (height)
		, m_x      (x)
		, m_y      (y)
	{
		m_cells = new Cell[width * height];
	}
 
	~SandChunk() {
		delete[] m_cells;
	}
 
	size_t GetIndex(int x, int y) {
		return (x - m_x)
		     + (y - m_y) * m_width;
	}
 
	bool InBounds(int x, int y) {
		return x >= m_x && x < m_x + m_width
		    && y >= m_y && y < m_y + m_height;
	}
 
	bool IsEmpty(int x, int y) { return IsEmpty(GetIndex(x, y)); }
	bool IsEmpty(size_t index) { return GetCell(index).Type == CellType::EMPTY; }
 
 
	Cell& GetCell(int x, int y) { return GetCell(GetIndex(x, y)); }
	Cell& GetCell(size_t index) { return m_cells[index]; }
 
	void SetCell(int x, int y, const Cell& cell) { SetCell(GetIndex(x, y), cell); }
	void SetCell(size_t index, const Cell& cell) { m_cells[index] = cell; }

The second edit we'll make allows for particles to move from one chunk to another. Let's edit the m_changes list to store tuples and add a SandChunk* along with the indices from before.

SandChunk.h

	std::vector<std::tuple<SandChunk*, size_t, size_t>> m_changes; // source chunk, source, destination
	
	void MoveCell(SandChunk* source, int x, int y, int xto, int yto)
	{
		m_changes.emplace_back(
			source,
			source->GetIndex(x, y),
			GetIndex(xto, yto)
		);
	}

In CommitCells, any .first or .second needs to be changed to std::get<>. Where we set the cell's data, instead of writing directly to m_cells, we'll use GetCell and SetCell with the source chunk.

SandChunk.h

	#define _DEST(x) std::get<2>(x)
 
	void CommitCells() { 
		// remove moves that have their destinations filled
 
		for (size_t i = 0; i < m_changes.size(); i++) {
			if (!IsEmpty(_DEST(m_changes[i]))) {
				m_changes[i] = m_changes.back(); m_changes.pop_back();
				i--;
			}
		}
 
		// sort by destination
 
		std::sort(m_changes.begin(), m_changes.end(),
			[](auto& a, auto& b) { return _DEST(a) < _DEST(b); }
		);
 
		// pick random source for each destination
 
		size_t iprev = 0;
 
		m_changes.emplace_back(nullptr, -1, -1); // to catch final move
 
		for (size_t i = 0; i < m_changes.size() - 1; i++) {
			if (_DEST(m_changes[i + 1]) != _DEST(m_changes[i])) {
				size_t rand = iprev + rand_int(i - iprev);
 
				auto [chunk, src, dst] = m_changes[rand];
 
				       SetCell(dst, chunk->GetCell(src));
				chunk->SetCell(src, Cell());
 
				iprev = i + 1;
			}
		}
 
		m_changes.clear();
	}
};

Now that we have chunks that can work together, we need to coordinate them in the SandWorld. Let's replace every function from before with one that gets a chunk, then calls its respected function.

SandWorld.h

class SandWorld {
public:
	const size_t m_chunkWidth;
	const size_t m_chunkHeight;
	const double m_scale;
	
public:
	SandWorld(size_t chunkWidth, size_t chunkHeight, double scale)
		: m_chunkWidth  (chunkWidth  / scale)
		, m_chunkHeight (chunkHeight / scale)
		, m_scale       (scale)
	{}
 
	bool InBounds(int x, int y) {
		if (SandChunk* chunk = GetChunk(x, y)) {
			return chunk->InBounds(x, y);
		}
 
		return false;
	}
 
	bool IsEmpty(int x, int y) {
		return InBounds(x, y)
		    && GetChunk(x, y)->IsEmpty(x, y);
	}
 
	Cell& GetCell(int x, int y) {
		return GetChunk(x, y)->GetCell(x, y);
	}
 
	void SetCell(int x, int y, const Cell& cell) {
		if (SandChunk* chunk = GetChunk(x, y)) {
			chunk->SetCell(x, y, cell);
		}
	}
 
	void MoveCell(int x, int y, int xto, int yto) {
		if (SandChunk* src = GetChunk(x, y))
		if (SandChunk* dst = GetChunk(xto, yto)) {
			dst->MoveCell(src, x, y, xto, yto);
		}
	}

The final piece we need to add is storage for the chunks. I opted for a vector of SandChunk* for iteration, and a map to look them up by coordinate.

SandWorld.h

	std::vector<SandChunk*> m_chunks;
	std::unordered_map<std::pair<int, int>, SandChunk*, pair_hash> m_chunkLookup;

Pair hash is defined as the (first hash * #) ^ second hash

pair_hash.h

struct pair_hash {
	template<typename T1, typename T2>
	size_t operator() (const std::pair<T1, T2>& pair) const {
		return ( std::hash<T1>()(pair.first) * 0x1f1f1f1f)
			   ^ std::hash<T2>()(pair.second);
	}
};

Now that we have our containers, we can make GetChunk. First, we'll convert the world coordinates into a chunk location, then we'll get the chunk or create a new one. To create a new chunk, let's make a function called CreateChunk and pass it the location. Here, we can define the world boundaries; if the location is inside, we'll make a new chunk, add it to the containers and return.

SandWorld.h

	SandChunk* GetChunk(int x, int y) {
		auto location = GetChunkLocation(x, y);
		SandChunk* chunk = GetChunkDirect(location);
 
		return chunk ? chunk : CreateChunk(location);
	}
 
	SandChunk* GetChunkDirect(std::pair<int, int> location) {
		auto itr = m_chunkLookup.find(location);
		auto end = m_chunkLookup.end();
 
		return itr != end ? itr->second : nullptr;
	}
 
	std::pair<int, int> GetChunkLocation(int x, int y) {
		return { floor(float(x) / m_chunkWidth),
		         floor(float(y) / m_chunkHeight)};
	}
private:
	SandChunk* CreateChunk(std::pair<int, int> location) {
		auto [lx, ly] = location;
 
		if (   lx < -10 || ly < -10 
		    || lx >  10 || ly >  10) // could pass in a world limit to constructor
		{
			return nullptr;
		}
 
		SandChunk* chunk = new SandChunk(m_chunkWidth, m_chunkHeight, lx, ly);
 
		m_chunkLookup.insert({ location, chunk });
		m_chunks.push_back(chunk);
 
		return chunk;
	}

Finally, the Update function needs to be edited to iterate over the chunks.

SomeClass.h

SandWorld m_world = SandWorld(200, 200, 2); // in a class somewhere

void Update() {
	// Draw cells with mouse
	// Paste tiles

	// Update cells

	for (SandChunk* chunk : m_world.m_chunks) {
		for (size_t x = 0; x < chunk->m_width;  x++)
		for (size_t y = 0; y < chunk->m_height; y++) {
			Cell& cell = chunk->GetCell(x + y * chunk->m_width);

			int px = x + chunk->m_x;
			int py = y + chunk->m_y;

			     if (cell.Props & CellProperties::MOVE_DOWN      && MoveDown    (px, py, cell)) {}
			else if (cell.Props & CellProperties::MOVE_DOWN_SIDE && MoveDownSide(px, py, cell)) {}
			else if (cell.Props & CellProperties::MOVE_SIDE      && MoveSide    (px, py, cell)) {}
		}
	}

	for (SandChunk* chunk : m_world.m_chunks) {
		chunk->CommitCells();
	}

	// Copy sand colors to a texture
	// Draw the texture on the screen
	// Remove tiles
}

And with that, we have chunked the world. I've made the texture draw a red pixel on the boundary of each chunk so we can see them. While I'm at it, I've made the camera center around the player so now we can fly around and see chunks all around us.

Performance #

Optimization is something that never ends, so I am going to only look at a few of the most bang for buck areas we can smooth out.

Let's start with something small. Currently, every time a world function gets called, it needs to find a chunk from the map. These can't be expected to be in the cache, so hitting this map for every cell will add up to a considerable loss of time. To get around this, let's make a SandWorker class that stores a chunk from the map, then only calls back to the world if necessary. This also gives us a nice way to hook into the engine with other custom behaviors.

SandWorker.h

class SandWorker {
protected:
	SandWorld& m_world;
	SandChunk* m_chunk;
 
public:
	SandWorker(SandWorld& world, SandChunk* chunk)
		: m_world (world)
		, m_chunk (chunk)
	{}
 
	void UpdateChunk() {
		for (int x = 0; x < m_chunk->m_width;  x++)
		for (int y = 0; y < m_chunk->m_height; y++) {
			Cell& cell = m_chunk->GetCell(x + y * m_chunk->m_width);
 
			int px = x + m_chunk->m_x;
			int py = y + m_chunk->m_y;
 
			UpdateCell(px, py, cell);
		}
	}
 
	virtual void UpdateCell(int x, int y, Cell& cell) = 0;
 
	Cell& GetCell(int x, int y) {
		if (m_chunk->InBounds(x, y)) {
			return m_chunk->GetCell(x, y);
		}
 
		return m_world.GetCell(x, y);
	}
 
	void SetCell(int x, int y, const Cell& cell) {
		if (m_chunk->InBounds(x, y)) {
			return m_chunk->SetCell(x, y, cell);
		}
 
		return m_world.SetCell(x, y, cell);
	}
 
	void MoveCell(int x, int y, int xto, int yto) {
		if (   m_chunk->InBounds(x, y)
		    && m_chunk->InBounds(xto, yto))
		{
			return m_chunk->MoveCell(m_chunk, x, y, xto, yto);
		}
 
		return m_world.MoveCell(x, y, xto, yto);
	}
 
	bool InBounds(int x, int y) {
		return m_chunk->InBounds(x, y)
		    || m_world .InBounds(x, y);
	}
 
	bool IsEmpty(int x, int y) {
		if (m_chunk->InBounds(x, y)) {
			return m_chunk->IsEmpty(x, y);
		}
 
		return m_world.IsEmpty(x, y);
	}
};

Now we can override this class and put our custom cell behaviors in it.

SimpleSandWorker.h

class SimpleSandWorker : public SandWorker {
public:
	SimpleSandWorker(SandWorld& world, SandChunk* chunk) 
		: SandWorker (world, chunk) 
	{}
 
	void UpdateCell(int x, int y, Cell& cell) override {
		     if (cell.Props & CellProperties::MOVE_DOWN      && MoveDown    (x, y, cell)) {}
		else if (cell.Props & CellProperties::MOVE_DOWN_SIDE && MoveDownSide(x, y, cell)) {}
		else if (cell.Props & CellProperties::MOVE_SIDE      && MoveSide    (x, y, cell)) {}
	}
private:
	bool MoveDown    (int x, int y, const Cell& cell) { /* ... */ }
	bool MoveDownSide(int x, int y, const Cell& cell) { /* ... */ }
	bool MoveSide    (int x, int y, const Cell& cell) { /* ... */ }
};

To make use of this, let's edit the Update function. You could add a list of these in the world, but for now I'll just hard code this one.

SomeClass.h

void Update() {
	// Draw cells with mouse
	// Paste tiles

	for (SandChunk* chunk : m_world.m_chunks) {
		SimpleSandWorker(m_world, chunk).UpdateChunk();
	}

	for (SandChunk* chunk : m_world.m_chunks) {
		chunk->CommitCells();
	}

	// Copy sand colors to a texture
	// Draw the texture on the screen
	// Remove tiles
}

The biggest time sink is iterating over the cells, so anything we can do to cutdown on the number we need to check will improve the frame rate dramatically. The most obvious is that most chunks are completely empty, so we could delete those and remove them from iteration entirely.

To do this, let's add a count of filled cells in the SandChunk. When setting a cell, if the source is filled but the destination isn't, we'll increment by one. Inversely, if the destination is filled and we are setting it to empty, we'll decrement.

SandChunk.h

	size_t m_filledCellCount;
 
	void SetCell(size_t index, const Cell& cell)
	{
		Cell& dest = m_cells[index];
 
		if (   dest.Type == CellType::EMPTY
		    && cell.Type != CellType::EMPTY) // Filling a cell
		{
			m_filledCellCount++;
		}
 
		else
		if (   dest.Type != CellType::EMPTY
			&& cell.Type == CellType::EMPTY) // Removing a filled cell
		{
			m_filledCellCount--;
		}
 
		dest = cell;
	}

In the SandWorld, let's add a new function called RemoveEmptyChunks.

SandWorld.h

void RemoveEmptyChunks() {
	for (size_t i = 0; i < m_chunks.size(); i++) {
		SandChunk* chunk = m_chunks.at(i);
 
		if (chunk->m_filledCellCount == 0) {
			m_chunkLookup.erase(GetChunkLocation(chunk->m_x, chunk->m_y));
			m_chunks[i] = m_chunks.back(); m_chunks.pop_back();
			i--;
 
			delete chunk;
		}
	}
}

Then call it before the world update.

SomeClass.h

void Update() {
	// Draw cells with mouse
	// Paste tiles

	m_world.RemoveEmptyChunks();

	// Update cells

	// Copy sand colors to a texture
	// Draw the texture on the screen
	// Remove tiles
}

If we look back at the chunk visualization, there are now only chunks where there are filled cells.

However, we still iterate over the static cells like rocks. This brings us to the second optimization. If a chunk only has a single filled cell, every cell is still iterated. We could though, select only a subsection to iterate. This technique is commonly referred to as a dirty rectangle and used in UI painting to save time by not redrawing static elements.

Whenever a cell gets set or moved, this rectangle needs to expand to contain it for the next update. We can't expand this rectangle during an update, so we need to double buffer it like we did for the initial Processing version of the cells array. To implement this, let's add a min/max x and y to the SandChunk along with two functions named UpdateRect and KeepAlive.

SandChunk.h

public:
	int m_minX, m_minY,
	    m_maxX, m_maxY;   // Dirty rect
private:
	int m_minXw, m_minYw,
	    m_maxXw, m_maxYw; // Working dirty rect
 
public:
	void KeepAlive(int x, int y) {
		KeepAlive(GetIndex(x, y));
	}
 
	void KeepAlive(size_t index) {
		int x = index % m_width;
		int y = index / m_width;
 
		m_minXw = clamp(min(x - 2, m_minXw), 0, m_width);
		m_minYw = clamp(min(y - 2, m_minYw), 0, m_height);
		m_maxXw = clamp(max(x + 2, m_maxXw), 0, m_width);
		m_maxYw = clamp(max(y + 2, m_maxYw), 0, m_height);
	}
 
	void UpdateRect() {
		// Update current; reset working
		m_minX = m_minXw;  m_minXw = m_width;
		m_minY = m_minYw;  m_minYw = m_height;
		m_maxX = m_maxXw;  m_maxXw = -1;
		m_maxY = m_maxYw;  m_maxYw = -1;	
	}

Then in SetCell, we'll pass the index to KeepAlive.

SandChunk.h

	void SetCell(size_t index, const Cell& cell)
	{
		// Set cell & update count

		KeepAlive(index);
	}

Because the chunks can update each other's rectangles, we need to wait for all the chunks to be committed before calling UpdateRect. To do this, let's add another loop after committing the cells in the main update.

Now, in the SandWorker's Update function, instead of iterating from 0 to the boundary, we can iterate from the min to the max of the rectangle.

SandChunk.h

	void UpdateChunk() {
		for (int x = m_chunk->m_minX; x < m_chunk->m_maxX; x++)
		for (int y = m_chunk->m_minY; y < m_chunk->m_maxY; y++) {
			Cell& cell = m_chunk->GetCell(x + y * m_chunk->m_width);
 
			int px = x + m_chunk->m_x;
			int py = y + m_chunk->m_y;
 
			UpdateCell(px, py, cell);
		}
	}

I was under the impression that this was all that was required, but if we look at the boundary of a sleeping chunk, the cells don't wake up correctly. This happens because the rectangles are bounded by the chunks and they can't talk to their neighbors.

We need a way to notify the chunks that an update has happened on their border. Let's add a KeepAlive function to the world,

SandWorld.h

	void KeepAlive(int x, int y) {
		if (SandChunk* chunk = GetChunk(x, y)) {
			chunk->KeepAlive(x, y);
		}
	}

and for now just edit MoveCell in the SandWorker to use it.

SandWorker.h

	void MoveCell(int x, int y, int xto, int yto)
	{
		int pingX = 0, pingY = 0;
 
		if (x == m_chunk->m_x)                         pingX = -1;
		if (x == m_chunk->m_x + m_chunk->m_width  - 1) pingX =  1;
		if (y == m_chunk->m_y)                         pingY = -1;
		if (y == m_chunk->m_y + m_chunk->m_height - 1) pingY =  1;
 
		if (pingX != 0)               m_world.KeepAlive(x + pingX, y);
		if (pingY != 0)               m_world.KeepAlive(x,         y + pingY);
		if (pingX != 0 && pingY != 0) m_world.KeepAlive(x + pingX, y + pingY);
 
		if (   m_chunk->InBounds(x, y)
		    && m_chunk->InBounds(xto, yto))
		{
			return m_chunk->MoveCell(m_chunk, x, y, xto, yto);
		}
 
		return m_world.MoveCell(x, y, xto, yto);
	}

Now finally, let's look at what we got.

That's a considerable speed up! These optimizations won't have much effect if the whole screen if full of moving cells though, we'll need to look to threading to alleviate that.

This made it sound easy, but in reality, it took me a while to stamp out all the bugs. If you're following along, before we get into threading, make sure this works 100% because it's going to be impossible to debug without disabling the threading part.

Threading #

Now we can get to the fun part, threading it all together. Multi-threading is tough because if you're not careful, race conditions can sneak in unexpectedly. Instead of a normal logic error that will cause a consistent problem, you could get random crashes. This makes these types of bugs hard to find if you're not sure where to look. Because of this, it's best to keep the threading code as concise as possible. Luckily for us this code is quite small, and not very interconnected, so it shouldn't be too bad.

Currently, we update each chunk one-by-one, but chunks are mostly independent from one another. This makes them perfect candidates for thread pooling!

The idea of a thread pool is to queue up a series of tasks and execute as many at a time as there are threads in the pool. In my code, I've got a thread pool variable named Task. You can checkout a simple implementation of one here

Let's edit the Update function to use this thread pool.

SomeClass.h

for (SandChunk* chunk : m_world.m_chunks) {
	Task->queue([&, chunk]() {
		SimpleSandWorker(m_world, chunk).UpdateChunk();
	});
}

And we're done, let's see a demo!

Well if only it was that simple, right?

There are many issues with this code, if we look at the execution line by line, notice that the current thread only pushes tasks onto the queue, leaving the chunk updates to finish sometime in the future. Before that can happen, we start calling CommitCells. This causes race conditions all over the place, and an inevitable crash.

To fix this we need a way to wait for all the chunks to finish. C++ provides the std:: condition_variable class, which we can use to pause a thread until a condition is met, and the std::mutex class which allows us to mark critical sections for the OS to guard. Only one thread can run critical sections protected by a specific mutex at a time which allows us to safely edit shared variables in different threads.

To make these changes we'll need three variables: one condition variable, one mutex, and a count of chunks to update. After updating the chunk, we'll lock and decrement the count. We'll use a std::unique_lock and pass it the mutex. This locks the mutex until it pops off the stack, so we can use a scope and make it a one-liner. After we get the correct count, let's tie it all together with a condition variable. These act like messengers between threads; we can send notifications to a waiting thread by calling notify_one. Calling wait, blocks the thread until it receives a notification. Once it does, and the condition is met, it continues onwards with the mutex locked.

SomeClass.h

		std::mutex mutex;
		std::condition_variable cond;
		int chunkCount = m_world.m_chunks.size();
 
		for (SandChunk* chunk : m_world.m_chunks) {
			Task->queue([&, chunk]() {
				SimpleSandWorker(m_world, chunk).UpdateChunk();
 
				{ std::unique_lock lock(mutex); chunkCount--; }
				cond.notify_one();
			});
		}
 
		{
			std::unique_lock lock(mutex);
			cond.wait(lock, [&]() { return chunkCount == 0; });
		}

We can also multithread the CommitCells function in much the same way. Before we do that, to keep the code clean, I am going to throw this loop into a lambda called doForAllChunks. Then we can make two calls to it, one for the updating and the other for the commitment.

SomeClass.h

	void Update() {
		// Draw cells with mouse
		// Paste tiles
 
		// Update cells
 
		m_world.RemoveEmptyChunks();
 
		std::mutex mutex;
		std::condition_variable cond;
 
		auto doForAllChunks = [&](std::function<void(SandChunk*)> func) {
			int chunkCount = m_world.m_chunks.size();
		
			for (SandChunk* chunk : m_world.m_chunks) {
				Task->queue([&, chunk]() {
					func(chunk);
 
					{ std::unique_lock lock(mutex); chunkCount--; }
					cond.notify_one();
				});
			}
 
			std::unique_lock lock(mutex);
			cond.wait(lock, [&]() { return chunkCount == 0; });
		};
 
		doForAllChunks([&](SandChunk* chunk) {
			SimpleSandWorker(m_world, chunk).UpdateChunk();
		});
 
		doForAllChunks([&](SandChunk* chunk) {
			chunk->CommitCells();
		});
 
		for (SandChunk* chunk : m_world.m_chunks) {
			chunk->UpdateRect();
		}
 
		// Copy sand colors to a texture
		// Draw the texture on the screen
		// Remove tiles
	}

Let's see what happens when we run it now.

It seems stable, but eventually it crashes. Why could this be? Well we didn't account for the different threads calling back to the world and into other chunks. Those other chunks could be getting updated at the same time, creating more race conditions.

We need to consider who could be calling what functions in the SandChunk and SandWorld, and make sure to guard the ones that multiple threads could call at the same time.

Let's start with the functions in the SandChunk. Multiple threads could collide inside of SetCell, MoveCell, and KeepAlive, so we'll need three mutexes.

SandChunk.h

private:
	std::mutex m_filledCellCountMutex;
	std::mutex m_changesMutex;
	std::mutex m_workingRectMutex;

In SetCell, we only need to guard the filled cell count because no two cells in the array will get written/read at the same time. This is guaranteed by the way that CommitCells works, and we won't be drawing with the mouse during the update.

SandChunk.h

	void SetCell(size_t index, const Cell& cell)
	{
		Cell& dest = m_cells[index];
 
		if (   dest.Type == CellType::EMPTY
			&& cell.Type != CellType::EMPTY)  // Filling a cell
		{
			std::unique_lock lock(m_filledCellCountMutex);
			m_filledCellCount++;
		}
 
		else 
		if (   dest.Type != CellType::EMPTY
			&& cell.Type == CellType::EMPTY) // Removing a filled cell
		{
			std::unique_lock lock(m_filledCellCountMutex);
			m_filledCellCount--;
		}
 
		dest = cell;
 
		KeepAlive(index);
	}

In MoveCell, we need to lock the whole list unfortunately, but this should only be in conflict when a chunk tries to move a particle into a neighboring chunk, which is a low percentage of the moves.

SandChunk.h

	void MoveCell(SandChunk* source, int x, int y, int xto, int yto)
	{
		size_t src = source->GetIndex(x, y);
		size_t dst = GetIndex(xto, yto);
 
		std::unique_lock lock(m_changesMutex);
 
		m_changes.emplace_back(source, src, dst);
	}

Finally, we need to also lock the KeepAlive function.

SandChunk.h

	void KeepAlive(size_t index) {
		int x = index % m_width;
		int y = index / m_width;
 
		std::unique_lock lock(m_workingRectMutex);
 
		m_minXw = clamp(min(x - 2, m_minXw), 0, m_width);
		m_minYw = clamp(min(y - 2, m_minYw), 0, m_height);
		m_maxXw = clamp(max(x + 2, m_maxXw), 0, m_width);
		m_maxYw = clamp(max(y + 2, m_maxYw), 0, m_height);
	}

In the SandWorld we could use two mutexes to guard the list and map but guarding the whole map every time it needs to be accessed will cripple the multithreaded performance. To get around this we can use a concurrent map, I saw that Microsoft provides a concurrent_unordered_map, so I just replaced the current map with it. Basically, this locks the buckets instead of the whole container, so more threads can key into it at once. We need to lock the list though, but we never use it for access besides iteration, so we only need to lock when inserting into it as multiple threads could be creating chunks at the same time.

SandWorld.h

#include <concurrent_unordered_map.h>
 
private:
	Concurrency::concurrent_unordered_map<std::pair<int, int>, SandChunk*, pair_hash> m_chunkLookup;
	std::mutex m_chunkMutex;
 
private:
	SandChunk* CreateChunk(std::pair<int, int> location) {
		auto [lx, ly] = location;
 
		if (   lx < -50 || ly < -50
			|| lx >  50 || ly >  50) // could pass in a world limit to constructor
		{
			return nullptr;
		}
 
		SandChunk* chunk = new SandChunk(m_chunkWidth, m_chunkHeight, lx, ly);
 
		m_chunkLookup.insert({ location, chunk });
 
		{
			std::unique_lock lock(m_chunkMutex);
			m_chunks.push_back(chunk);
		}
 
		return chunk;
	}

That should be all for multithreading! Let's take a look at the performance.

Look at that frame time! I think this is a good basis for a strong sand engine. Now I am going to try and build a game with it, so expect a few dev logs about that. I was hoping to make a game in a month but including engine dev time I'm already way past that, so we'll pause the clocks and I'll try and do some sort of weekly thing if I can, I think that'll be interesting. I tried making that space game I was taking about, and it was actually pretty cool, but there wasn't too much sand involved unfortunately, so I am going to try and repurpose the mechanics from it into something on the ground instead of in space.

Comments#