
Bug#755043: Initial import needs more than 14GB of RAM



 ❦ 17 July 2014 12:21 +0200, Raphael Hertzog <hertzog@debian.org>:

>> I have tried to setup a local instance to contribute some patches but
>> the initial import of the database needs more than 14GB of RAM. This
>
> What command did you use for the initial import?
> Was that with the fixture distro_tracker/core/fixtures/debian-repositories.xml
> or with a custom set of repositories?

Yes, with this fixture.

> How did you evaluate that memory requirement?

Well, I got the process killed by OOM after eating all memory + swap. :)

> I know it takes several GB but none of the machines where I did the
> initial import had so much memory so your claim seems strange
> to me.

Here is my output:

2014-07-16 18:47:22,449 INFO: UpdateRepositoriesTask Updating apt's cache
2014-07-16 18:47:44,343 INFO: UpdateRepositoriesTask Updating data from Sources files
2014-07-16 18:47:44,343 INFO: UpdateRepositoriesTask Processing Sources files of oldstable repository
2014-07-16 18:50:43,416 INFO: UpdateRepositoriesTask Processing Sources files of stable repository
2014-07-16 18:53:23,963 INFO: UpdateRepositoriesTask Processing Sources files of unstable repository
2014-07-16 18:56:33,902 INFO: UpdateRepositoriesTask Processing Sources files of exp repository
2014-07-16 18:56:44,368 INFO: UpdateRepositoriesTask Processing Sources files of testing repository
2014-07-16 18:57:50,846 INFO: UpdateRepositoriesTask Processing Sources files of old-bpo repository
2014-07-16 18:57:57,972 INFO: UpdateRepositoriesTask Processing Sources files of stable-bpo repository
2014-07-16 18:58:11,512 INFO: UpdateRepositoriesTask Processing Sources files of old-p-u repository
2014-07-16 18:58:12,382 INFO: UpdateRepositoriesTask Processing Sources files of stable-p-u repository
2014-07-16 18:58:13,073 INFO: UpdateRepositoriesTask Processing Sources files of test-p-u repository
2014-07-16 18:58:13,167 INFO: UpdateRepositoriesTask Processing Sources files of old-lts repository
2014-07-16 18:58:13,479 INFO: UpdateRepositoriesTask Processing Sources files of old-upd repository
2014-07-16 18:58:13,653 INFO: UpdateRepositoriesTask Processing Sources files of stable-upd repository
2014-07-16 18:58:13,821 INFO: UpdateRepositoriesTask Processing Sources files of old-bpo-sl repository
2014-07-16 18:58:14,224 INFO: UpdateRepositoriesTask Removing obsolete source packages
2014-07-16 18:58:14,672 INFO: UpdateRepositoriesTask Updating data from Packages files
2014-07-16 18:58:14,672 INFO: UpdateRepositoriesTask Processing Packages files of unstable repository
zsh: killed     ./manage.py tracker_update_repositories

It ran for several hours in "Processing Packages files of unstable
repository" before being killed.

> That said, I often run only ./manage.py tracker_update_repositories for
> the initial import (and not run_all_tasks).

That's what I used too.

>> Trying to limit to unstable doesn't help either.
>
> Huh. Reducing the number of repositories and packages should really
> help... I wonder what you're hitting here.

Let me try again with just unstable on amd64 and a clean database.

>> Maybe a fixture with only a thousand packages or a limit of 1000
>> packages per source would help.
>
> We should certainly try to optimize the memory consumption of the
> repository update process.
>
> I have zero experience in analysis of Python's memory usage but I believe
> that there are good tools for this.
>
> Among the packaged tools I found python-meliae and python-memprof.
> python-objgraph might also be useful. 
>
> And python 3 has tracemalloc...
> https://docs.python.org/3/library/tracemalloc.html
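
A minimal sketch of how tracemalloc could be used to find the
allocation hot spots, assuming Python 3.4 or later. The helper
top_allocations() and the stand-in workload build_package_list() are
hypothetical names made up for illustration; they are not part of
distro-tracker, and the real import step would have to be passed in
instead of the stand-in.

```python
import tracemalloc


def top_allocations(func, *args, limit=5, **kwargs):
    """Run func under tracemalloc.

    Returns (result, peak_bytes, top_stats), where top_stats lists the
    source lines holding the most memory when func returned.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # Snapshot while the result is still alive, so that its
        # allocations show up in the statistics.
        snapshot = tracemalloc.take_snapshot()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    top = snapshot.statistics("lineno")[:limit]
    return result, peak, top


def build_package_list(count):
    # Hypothetical stand-in for the import step (NOT distro-tracker code).
    return [{"name": "pkg%d" % i, "version": "1.0"} for i in range(count)]


if __name__ == "__main__":
    result, peak, top = top_allocations(build_package_list, 10000)
    print("peak traced memory: %.1f MiB" % (peak / 2 ** 20))
    for stat in top:
        print(stat)
```

Each printed statistic names a file and line number together with the
size and count of the blocks allocated there, which should be enough to
see which part of the update holds on to the most memory.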

Maybe that would be useful, but since this did not bother you, my
point was essentially that it makes getting started difficult for a
first-time contributor. For development, some kind of limit or a set of
small fake repositories would be useful. A pre-made sqlite3 database
dump would help too, but it may be too cumbersome to keep updated and
would not allow working on every part.
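
To illustrate the kind of limit meant here, a minimal sketch under
stated assumptions: it reads a Sources-style file as blank-line
separated stanzas, lazily, and stops after a fixed number of them. The
names iter_stanzas() and limited_stanzas() are hypothetical and this is
not how distro-tracker reads its repositories; the parsing is
deliberately simplified, and a real control-file parser should be
preferred in practice.

```python
import io
from itertools import islice


def iter_stanzas(lines):
    """Yield one dict per blank-line separated stanza, one at a time."""
    stanza = {}
    key = None
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends the current stanza.
            if stanza:
                yield stanza
            stanza, key = {}, None
        elif line[0] in " \t" and key is not None:
            # Continuation line: append to the previous field.
            stanza[key] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            stanza[key] = value.strip()
    if stanza:
        yield stanza


def limited_stanzas(lines, limit=1000):
    """Stop after `limit` stanzas without reading the rest of the input."""
    return islice(iter_stanzas(lines), limit)


if __name__ == "__main__":
    sample = io.StringIO(
        "Package: a\nVersion: 1\n\n"
        "Package: b\nVersion: 2\n\n"
        "Package: c\nVersion: 3\n"
    )
    for stanza in limited_stanzas(sample, limit=2):
        print(stanza["Package"], stanza["Version"])
```

Because both functions are generators, only one stanza is held in
memory at a time and nothing past the limit is parsed.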
-- 
panic ("No CPUs found.  System halted.\n");
        2.4.3 linux/arch/parisc/kernel/setup.c
