Slow when processing multiple insert statements

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Slow when processing multiple insert statements

Si Carter
It might already be a known issue, which is why I didn't report it using
RFE/Bugs @ sourceforge.

FB 1.5.2 (Classic)
Windows XP
FR 0.4.0
LocalHost

Running an insert script with just over 74000 insert statements was very
slow, after about 25 mins I gave up and terminated FR.  IBOConsole, running
the same script took 3 - 4 mins.  When FR was used the task manager showed
that FR was using 98% (on average) CPU with the remainder going to all other
processes.  With IBOconsole the task manager showed approx 40% cpu to
firebird and about 30% to iboconsole.

Before running the same script with IBOConsole I did a SELECT COUNT(*) from
the table and only 12000 items had been committed by FR.

I have also added a bug report
(http://sourceforge.net/tracker/index.php?func=detail&aid=1295338&group_id=1
24340&atid=699234) which I noticed as a side effect of the slow processing.

Regards

Si Carter



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Milan Babuskov-5
Simon Carter wrote:
> Running an insert script with just over 74000 insert statements was very
> slow, after about 25 mins I gave up and terminated FR.

Are there any commit statements in the script?

> Before running the same script with IBOConsole I did a SELECT COUNT(*) from
> the table and only 12000 items had been committed by FR.

My guess that there was a commit in script after 12000 records, and it
triggered statement processing code which got stuck somewhere. Is there
such commit statement?

Thanks,
--
Milan Babuskov
http://fbexport.sourceforge.net
http://www.flamerobin.org



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

RE: Slow when processing multiple insert statements

Si Carter
 

> -----Original Message-----
> Are there any commit statements in the script?

Every 6000 records there is a commit.

Rgds

Si



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Michael Hieke-2
In reply to this post by Milan Babuskov-5
Milan, Simon,

Milan Babuskov wrote:

> My guess that there was a commit in script after 12000 records, and
> it triggered statement processing code which got stuck somewhere.

I don't think that FR was necessarily stuck, it may be that it consumed
so much memory that most of the time was spent in the memory manager.  I
see this problem when handling large query result sets.
Simon, how much memory has the machine in question, and how much of it
was used by FR?

Thanks

--
Michael Hieke


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Milan Babuskov-5
In reply to this post by Si Carter
Simon Carter wrote:
>>Are there any commit statements in the script?
> Every 6000 records there is a commit.

I did a test myself. I dumped 100.000 records (using fbexport in just 14
secs ;) from a table to insert statements, and tried to run the script
from FR:

1. the script is ~11MB. The sql editor wasn't too responsive anymore,
but usable.

2. I think we do a some string operations that are overkill in this
case, but that's not the main issue

3. each statement is prepared, run and logged in "statistics" window.
So, for each statement, add some 200 bytes of text. For 100.000 records,
that's ~20MB more, making it a total of 30+ MB for that text control. As
contents add up I guess is works slower and slower.

But I'm not sure what to fix here. We could add the option "not to log"
for large statement buffers. Or even better, detect it ourselves and
popup a message dialog. We can also improve some string handling, and
that's about it.

Few tips:

1. running such big scripts from a GUI tool is not a good idea. isql is
much better choice. If you don't like the isql's error control (or lack
of it to be precise), you can use FBExport or a similar command-line tool.

2. importing that much data via INSERT statements is bad idea by itself,
since each statement has to be prepared. It is more efficient to prepare
a statement once, and just feed different data as parameters. If you
wish to import data from other FB database there are tools (FBCopy,
IBDataPump) that do exactly what I described. I guess you already knew
this, but just to be sure.

--
Milan Babuskov
http://fbexport.sourceforge.net
http://www.flamerobin.org



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

RE: Slow when processing multiple insert statements

Si Carter
In reply to this post by Michael Hieke-2
 

> -----Original Message-----
> I don't think that FR was necessarily stuck, it may be that
> it consumed so much memory that most of the time was spent in
> the memory manager.  I see this problem when handling large
> query result sets.

I don't think it was stuck either, just processing slowly and eating cpu as
it goes.  My guess, and I don't know FR code, is that its in a tight loop,
and not releasing the CPU, i.e. Sleep(0) on windows.

> Simon, how much memory has the machine in question, and how
> much of it was used by FR?

I'm just running the same script again, here are some stats from my pc:

Phys Mem
Total 1015280
Avail 509828
Sys Cache 401840

Kernel Memory (k)
Total 50792
Paged 41568
Non paged 9224

Commit charge 673M/2390M

CPU 98%
Mem Usage: 47,544 (incrementing by about 4k/second)
Mem Delta: flick between 4k and 8k
I/O Other is currently 42,457 and incrementing by about 63 every second

All other items in task manager are remaining more or less constant.  Ive
just checked and 6000 rows have been inserted after approx 12 minutes of cpu
time.  This was checked using iboconsole.

The script I'm using, if your interested, can be downloaded from
(http://www.tectsoft.net/download/firebird/geoip/geoip_data.zip)

Rgds

Si



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Michael Hieke-2
Simon,

Simon Carter wrote:

> I don't think it was stuck either, just processing slowly and eating
> cpu as it goes.  My guess, and I don't know FR code, is that its in a
> tight loop, and not releasing the CPU, i.e. Sleep(0) on windows.

there are indeed no calls to Sleep() in the execution path, but even
they would not really help for long prepares and such things.  Things
should improve immensely with a threaded execution engine.  But I must
agree with Milan that running thousands of INSERTs like this will always
be suboptimal.  If the logging is a limiting factor, it will not be sped
up by the threading.

> Phys Mem
> Total 1015280
> Avail 509828
> Sys Cache 401840

> CPU 98%
> Mem Usage: 47,544 (incrementing by about 4k/second)
> Mem Delta: flick between 4k and 8k
> I/O Other is currently 42,457 and incrementing by about 63 every second

Ok, since Firebird isn't using much of the CPU, and there's little I/O
it does look indeed as if FR spends most of the time in internal
processing.  Memory reallocation, string processing, styling of text,
that kind of thing.  Maybe adding Freeze() and Thaw() for the log
control would help.  But that is only idle speculation, it needs to be
measured.  I will run this on the Apple, there is a nifty tool there
that interrupts the running code every 10 ms and logs where it is.  That
should give us an idea where this processing time goes into.

Thanks

--
Michael Hieke


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

RE: Slow when processing multiple insert statements

Si Carter
In reply to this post by Milan Babuskov-5
> -----Original Message-----
> 3. each statement is prepared, run and logged in "statistics" window.
> So, for each statement, add some 200 bytes of text. For
> 100.000 records, that's ~20MB more, making it a total of 30+
> MB for that text control. As contents add up I guess is works
> slower and slower.

Could be, perhaps it might be worth having a max buffer size for the
control, to stop it getting inundated?

> 1. running such big scripts from a GUI tool is not a good
> idea. isql is much better choice. If you don't like the
> isql's error control (or lack of it to be precise), you can
> use FBExport or a similar command-line tool.

I couldn't agree more, I have used FBExport on several occasions, great tool
:-) However, in this instance it was a one off script, on my dev machine.  I
set it to run and then went for lunch, if it wasn't for an error in one of
the insert statements, which stopped it, I'd never have noticed the lag, as
I had to re-run the script after lunch.

The one thing I wasn't too keen on with FR's implementation, and I have to
be honest, I am using FR all the time now, I only jump to IBOConsole to
retrieve metadata. But the one thing I wasn't keen on is its use of the CPU,
the computer was almost bought to its knees.  Granted, this pc has a crap
processor, but I still don't like app's which hog the cpu.  In hindsight, I
should have lowered the threads priority *doh!*.

Take what I say with a pinch of salt, I fully understand that FR is in beta,
you guys are doing a great job, and every release is getting better and
better.  I'm sure that when you start profiling FR, a lot of the CPU usage
ive noticed will disappear.  Also from my POV I would rather report these
things now, than leave it and forget, if you all feel its too soon to be
reporting these types of issues, let me know.

Rgds

Si



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

RE: Slow when processing multiple insert statements

Si Carter
In reply to this post by Michael Hieke-2
> -----Original Message-----
> there are indeed no calls to Sleep() in the execution path,
> but even they would not really help for long prepares and
> such things.  Things should improve immensely with a threaded
> execution engine.  

Even threaded execution's can hog a cpu, on windows, a call to Sleep(0) in
tight loops, will stop the app from taking all the cpu.

> But I must agree with Milan that running
> thousands of INSERTs like this will always be suboptimal.  If
> the logging is a limiting factor, it will not be sped up by
> the threading.

I agree fully, thousands of inserts using this type of admin tool is not
optimal, like I said in a previous message, profiling will help, but I
appreciate it doesn't come early in a projects lifecycle :-)

From my POV I would rather have the functionality in place, and worry about
the script I ran at a later date, after all, its not as if I run the script
several times a day :)

> But that is only idle speculation, it needs to be measured.  
> I will run this on the Apple, there is a nifty tool there
> that interrupts the running code every 10 ms and logs where
> it is.  That should give us an idea where this processing
> time goes into.

That sounds like a very useful utility, I have a license for AQTime
(http://www.automatedqa.com/downloads/aqtime/index.asp), I will, as soon as
time permits, run FR through it to see if it highlights anything as it gives
some nifty stats on running apps.

Rgds

Si



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Milan Babuskov-5
In reply to this post by Si Carter
Simon Carter wrote:
> Take what I say with a pinch of salt, I fully understand that FR is in beta,

Alpha, actually. When we go "beta", you won't be using anything else ;)

> you guys are doing a great job, and every release is getting better and
> better.  I'm sure that when you start profiling FR, a lot of the CPU usage
> ive noticed will disappear.

To be honest, I mostly concentrated on features so far. FR needs to
cover few more of my own needs. I'm aware that there are a lot of space
for improvements and optimizations.

> Also from my POV I would rather report these
> things now, than leave it and forget, if you all feel its too soon to be
> reporting these types of issues, let me know.

Just keep the reports coming in. We appreciate it a lot. Better now,
than in FlameRobin 1.0.

--
Milan Babuskov
http://fbexport.sourceforge.net
http://www.flamerobin.org



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Michael Hieke-2
In reply to this post by Michael Hieke-2
I wrote:

> But that is only idle speculation, it needs to be measured.  I will
> run this on the Apple, there is a nifty tool there that interrupts
> the running code every 10 ms and logs where it is.  That should give
> us an idea where this processing time goes into.

See attached screenshot for a screenshot of the output of FlameRobin
running under Sampler.  Logging needs to be limited, log control output
frozen for the whole batch.

Thanks

--
Michael Hieke

FR_Sampler.png (60K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Milan Babuskov-2
Michael Hieke wrote:
> See attached screenshot for a screenshot of the output of FlameRobin
> running under Sampler.  Logging needs to be limited, log control output
> frozen for the whole batch.

I'm not sure how to interprete these numbers? Does it mean that 4666 or
the time is spent for execute and 2494 for logging? It would mean that
1/3 of total time is spent on logging. I assumed it would be much more
(did you run a really large sql script?).

--
Milan Babuskov
http://fbexport.sourceforge.net
http://www.flamerobin.org



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Michael Hieke-2
Milan,

Milan Babuskov wrote:

> I'm not sure how to interprete these numbers? Does it mean that 4666 or
> the time is spent for execute and 2494 for logging? It would mean that
> 1/3 of total time is spent on logging. I assumed it would be much more
> (did you run a really large sql script?).

I created a very simple table (integer primary key, timestamp), and made
10 blocks of 1000 INSERTs, followed by a COMMIT.  I waited for maybe 3
or 4 of those blocks, and I noticed that the time for each block
increased.  That's understandable, because preparing, executing etc.
takes basically always the same time, but the logging works on ever
increasing amounts of data.

So for very few INSERTs (at most 5000) I have roughly the following
distribution:

   99% ExecuteSqlFrame::parseStatements
     64% ExecuteSqlFrame::execute()
       34% ExecuteSqlFrame::log()
         18% wxStyledTextCtrl::GotoPos()
         16% wxStyledTextCtrl::AddText()
         ...
       11% StatementImpl::Prepare()
       4.4% StatementImpl::Plan()
       4.4% StatementImpl::Release()
       4.0% StatementImpl::Execute()
       2.8% StatementImpl::AffectedRows()
       ...
     32% wxStringBase::find()
     ...
   ...

For more statements the execution part will take less and less of the
time, and the housekeeping will take more and more.  Obvious points that
need changing:

- wxStyledTextCtrl::GotoPos() -> needs only be called when everything is
done, not in each log() call.
- wxStringBase::find() -> we NEED a decent parser, sooner rather than later.

The whole way FR treats statement text is not fitting for larger amounts
of text.  Comments need to be scanned over, not removed from statements.

There are enough parsers around to look at.  There are tools to create
them.  We might even simply take a parser class from Vulcan project and
be done with it.

Thanks

--
Michael Hieke


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel
Reply | Threaded
Open this post in threaded view
|

Re: Slow when processing multiple insert statements

Milan Babuskov-2
Michael Hieke wrote:
> For more statements the execution part will take less and less of the
> time, and the housekeeping will take more and more.  Obvious points that
> need changing:
>
> - wxStyledTextCtrl::GotoPos() -> needs only be called when everything is
> done, not in each log() call.

The initial idea was to be able to see "how it's going" but user never
gets that, so I agree.

> - wxStringBase::find() -> we NEED a decent parser, sooner rather than
> later.

Sure.

> The whole way FR treats statement text is not fitting for larger amounts
> of text.

Well, that was never a requirement I had on my mind while writing it. If
I did, I would surely write a different code.

> Comments need to be scanned over, not removed from statements.

I know, I know ;)

> There are enough parsers around to look at.  There are tools to create
> them.  We might even simply take a parser class from Vulcan project and
> be done with it.

I generally agree with this, except regarding the "simply"-ness. What
tools are needed to build a parser? IIRC, it isn't pure C++ anymore, but
some tool like yacc or byson. I thought about this myself few times, but
it seemed pretty complicated.

--
Milan Babuskov
http://fbexport.sourceforge.net
http://www.flamerobin.org



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
Flamerobin-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/flamerobin-devel