Hi,

> As the others have mentioned, yes, we are doing 2 major restructure:
>
> 1) base classing of our convertor classes and Pootle
> 2) Locking
>
> Some of this does affect backend separation.
>
> There of course is lots of work, but I understand that yours needs to
> be quite independent. We have been discussing the concept of change
> queues which relates to locking ie we don't want to step on someone
> else's change if for some reason you had yours out for too long.
> Queues would also allow us to create a distributed environment. This
> direction could be the most fruitful for your work.
>
> But as the others have said #pootle is a good place to ask.

I have a strong opinion about the direction in which Pootle's backend
should be headed. I think that at the moment you have a 'loose' system
based on files, which is simple and transparent. However, it is
inefficient in memory usage, speed and ease of distribution. Since
memory usage depends on the database size, I take it that you are using
some sort of in-memory cache to speed up the system.

I think that an obvious solution here is to use a relational database
(I would suggest PostgreSQL). Unlike ordinary files, it allows very
fast random writes, which is exactly what we need here. I would expect
the problem of high memory usage to disappear completely too. In fact,
if we can put all important data in the database, distributing the
system would then become trivial -- several instances of the
application (possibly on different computers) would simply use a single
instance of the database. I would say that by doing everything
(indexing, locking, etc.) manually we're reinventing the wheel, badly.

I also think that using XLIFF, an XML format, for the backend is a bad
idea. I think that XML is great for serializing data and sharing it
between completely disparate systems, but it's awful for random writes
and for places where performance is important (such as the data storage
backend for a heavily used system).
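
To make the point about random writes concrete, here is a minimal
sketch of what updating a single translation could look like with a
relational store. It uses Python's built-in sqlite3 module purely as a
stand-in for PostgreSQL, and the table, column and function names are
my own assumptions for illustration, not anything that exists in Pootle
today.

```python
import sqlite3

# Hypothetical schema: one row per translation unit, keyed by
# project, language and source string. An in-memory SQLite database
# stands in for a real PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE units (
        project  TEXT NOT NULL,
        language TEXT NOT NULL,
        source   TEXT NOT NULL,
        target   TEXT NOT NULL DEFAULT '',
        PRIMARY KEY (project, language, source)
    )
    """
)


def add_unit(conn, project, language, source, target=""):
    """Insert one translation unit."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO units (project, language, source, target) "
            "VALUES (?, ?, ?, ?)",
            (project, language, source, target),
        )


def set_translation(conn, project, language, source, target):
    """Update one unit in place; return the number of rows changed."""
    with conn:
        cur = conn.execute(
            "UPDATE units SET target = ? "
            "WHERE project = ? AND language = ? AND source = ?",
            (target, project, language, source),
        )
    return cur.rowcount


def get_translation(conn, project, language, source):
    """Return the stored translation, or None if the unit is unknown."""
    row = conn.execute(
        "SELECT target FROM units "
        "WHERE project = ? AND language = ? AND source = ?",
        (project, language, source),
    ).fetchone()
    return row[0] if row else None
```

The update touches only the one row found through the primary key, and
the database takes care of indexing and transactions, which is the work
we would otherwise be doing by hand on top of files.
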
Nobody cares whether the backend storage is compliant with some
standard; it is only at the interface that standards compliance
matters. The backend must simply be as efficient as possible and not
get in the way.

I do not mean to trash your design decisions or stall work on the
backend. Files as a backend are great for small projects where
performance is unimportant, because then you don't need to set up an
SQL server. I just want to suggest designing the API in a way that does
not depend on files, i.e., such that a relational database would not be
too much trouble to plug in.

I am concerned with this issue because my SoC project is for Debian,
not for Pootle directly, so I will need a backend that can handle *all*
Debian translations at the same time. That would mean gigabytes of data
-- a "small" relational database by modern standards, but seemingly
infeasible with the current structure because of its huge resource
usage. And I need it by the end of summer ;)

Obviously I will be working on this, but I would like to make sure in
advance that you will not be working in the opposite direction. During
the discussion with Aigars in Riga he seemed to agree with the idea of
hooking Pootle up to a relational database. I would be happy to hear
your thoughts.

I hope the letter did not come out too harsh. There may be more options
here, or you may have some plans that I am simply not aware of.
However, this is critical for my work for Debian and I want to settle
it ASAP.

Best regards,
-- 
Gintautas Miliauskas
http://gintasm.blogspot.com