Plugin API for Scheduled Tasks #1184

Closed
robocoder opened this issue Mar 1, 2010 · 18 comments
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
@robocoder
Contributor

Use one crontab entry to trigger Piwik archiving, daily report generation, bots, etc.

This plugin:

  • exposes a new hook for other plugins to register and run some scheduled processing when called
  • either provides a helper function that tells other plugins whether they should run, given a crontab-like schedule, e.g., isItTimeToRun('* * * * *'); or adds a method to Piwik_Plugin that returns a crontab-like schedule that can be evaluated, e.g., getSchedule()

Updates the UI Settings 'general settings'

  • Reports whether automatic maintenance is working or not
    • cron not detected in the last 24 hours but tracker maintenance triggered
    • cron is detected every 10s-1h?
    • Maintenance is not executed. Check that Piwik is tracking visitors.

This plugin is not #817.

@mattab
Member

mattab commented Mar 5, 2010

See also #587, which could allow triggering these crontab-like tasks from piwik.php requests in case users don't set up automatic crontabs.

If an automatic crontab is set up (which Piwik can detect automatically), then crontab tasks are not triggered by piwik.php (see #587).

@mattab
Member

mattab commented Mar 5, 2010

I believe we should update the documentation and have the crontab fire more regularly, i.e. every 15 minutes, in case some plugins need to run tasks more frequently. The standard archiving task would only trigger after config.ini.php > time_before_today_archive_considered_outdated seconds.

@mattab
Member

mattab commented Apr 26, 2010

#5 and #53 are feature candidates for this hook

@mattab
Member

mattab commented Apr 26, 2010

We need to think about the current archive.sh script and how it would be changed to accommodate this new hook (either call this plugin specifically, or change the way archive.sh works so that it calls this plugin, which would in turn trigger archiving?). Note that it might be better to leave archive.sh with the current "looping over websites and periods" to archive them separately, because otherwise triggering all archives at once will result in memory issues for Piwik installs with hundreds or thousands of websites.

@mattab
Member

mattab commented Apr 30, 2010

Also, do we need a system to enforce that such a task cannot be run twice at the same time (a software-level, or DB-level, lock mechanism)?
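
One way such a software-level lock could look is an advisory file lock taken around the task. This is only a sketch under assumptions: `runExclusively` and the lock file path are hypothetical names, not part of Piwik, and a file lock only protects runs on the same host (a DB-level lock would be needed across servers).

```php
<?php
// Hypothetical helper, not part of Piwik: runs $task only if no other
// caller currently holds an exclusive lock on $lockFile.
function runExclusively($lockFile, $task)
{
    $handle = fopen($lockFile, 'c'); // create if missing, do not truncate
    if ($handle === false) {
        return false;
    }
    // LOCK_NB: return immediately instead of waiting for the lock
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return false; // another run is in progress
    }
    try {
        call_user_func($task);
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
    return true;
}
```

A caller would wrap the scheduled-task run in `runExclusively(...)` and simply skip the run when it returns false.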

@mattab
Member

mattab commented Jun 17, 2010

Sending email reports is also a candidate for this hook; see for example the PDF plugin, #71.

@mattab
Member

mattab commented Jul 12, 2010

Implementation proposal

  • new plugin TaskScheduler
  • exposes an API TaskScheduler.runTasks. This function triggers a hook, and existing plugins can register tasks to run.
// pseudo code of function hooking on runTasks
function runOptimizeTables($notification)
{
   // run every Monday at 2AM
   if( TaskScheduler.shouldRunTask( 'my task ID name', 'weekly' ))
   { 
        // execute task
   }
}
  • Available schedules are hourly, daily, weekly, monthly

Note that we don't have minutes, because the smallest possible granularity is the hour (crontabs are set up to run once per hour and probably should never run more often).

The difference when running scheduled tasks via piwik.php rather than via cron is that they might be triggered more than once per hour (even though not every request to piwik.php will trigger the scheduled tasks: for obvious optimization reasons, only one random request out of many will).

A solution to this issue is to plan schedules ahead of time (i.e. compute the time at which the task will run next). Then, when the task successfully runs, re-schedule it for next time (e.g. next week for a weekly task).

pseudo code

function shouldTaskRun( taskID, interval, [ minimumTimestamp ] )
  if(minimumTimestamp > time()) return false;

  schedule = Piwik_GetOption('schedule')
  shouldRunTask = false;
  if(isset(schedule[taskID]))
  {
    // task already scheduled, run only once scheduled_time has been reached
    if(schedule[taskID]['scheduled_time'] <= time())
    {
       shouldRunTask = true;
    } 
  }
  else
  {
     // new task, always run once the first time cron is run
     shouldRunTask = true;
  }

  if(shouldRunTask)
  {
    // compute the next time at which the task should run
    nextScheduleTime = time() + (if hourly then 3600 elseif daily then 86400 etc.);
    schedule[taskID]['scheduled_time'] = nextScheduleTime;

    // record updated schedule in DB
    Piwik_SetOption('schedule', schedule);
  }

  return shouldRunTask;
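
Read as "run once the scheduled time has been reached, and re-schedule only after a run", the pseudo code could be sketched in PHP roughly as follows. Assumptions: an in-memory array stands in for Piwik_GetOption / Piwik_SetOption, the current time is passed in as `$now` so the logic can be checked without waiting, and "monthly" is left out because it is not a fixed number of seconds.

```php
<?php
// Sketch only, not the Piwik implementation. $schedule maps task IDs to the
// timestamp of their next allowed run and replaces the option storage.
function shouldTaskRun(array &$schedule, $taskId, $interval, $now, $minimumTimestamp = 0)
{
    $intervals = array('hourly' => 3600, 'daily' => 86400, 'weekly' => 604800);
    if (!isset($intervals[$interval])) {
        throw new InvalidArgumentException("Unknown interval: $interval");
    }
    if ($minimumTimestamp > $now) {
        return false; // too early in the day
    }
    // A task that was never scheduled runs on the first call;
    // a known task runs only once its scheduled time has been reached.
    if (isset($schedule[$taskId]) && $schedule[$taskId] > $now) {
        return false;
    }
    // Re-schedule only when the task is allowed to run.
    $schedule[$taskId] = $now + $intervals[$interval];
    return true;
}
```

With this shape, repeated triggers inside the same hour (for example from piwik.php) return false until the stored time is reached.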

minimumTimestamp can be used to define exactly what time of day tasks should run.

For example, to run a daily job at 2AM, you would write in your plugin

if( TaskScheduler.shouldRunTask( 'my task ID name', 'daily', mktime(2,0,0,date('m'),date('d'),date('Y'))  ))

What will happen is that the first time the cron triggers after 2AM, this scheduled task will be allowed to run. shouldRunTask will then compute the next time it should run, which is 2AM the next day.

Edge case: if the cron didn't run before 5AM (for some reason), it will trigger the 2AM task at 5AM. However, you wouldn't want to schedule tomorrow's task at 5AM but at 2AM. You can use code such as

 now = time();
 interval = 86400; // for example
 nextScheduleTime = now + interval - ((now - minimumTimestamp) % interval);
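
As a worked check of that formula, here is the 5AM edge case in PHP. `alignedNextRun` is a hypothetical name, the timestamps are in UTC, and the sketch assumes `$now` is not earlier than `$minimumTimestamp`.

```php
<?php
// Hypothetical helper: next run time aligned to $minimumTimestamp,
// even when the current run started late.
function alignedNextRun($now, $minimumTimestamp, $interval)
{
    return $now + $interval - (($now - $minimumTimestamp) % $interval);
}

$twoAm  = gmmktime(2, 0, 0, 7, 12, 2010); // intended run time
$fiveAm = gmmktime(5, 0, 0, 7, 12, 2010); // when cron actually fired
echo gmdate('Y-m-d H:i', alignedNextRun($fiveAm, $twoAm, 86400)), "\n"; // prints 2010-07-13 02:00
```

The three hours of delay are subtracted again, so the next run lands back on 2AM rather than drifting to 5AM.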

let me know if this makes sense, cheers

@mattab
Member

mattab commented Jul 12, 2010

Note: inspired from WP implementation see
http://phpxref.ftwr.co.uk/wordpress/nav.html?wp-includes/cron.php.html#wp_schedule_event

http://phpxref.ftwr.co.uk/wordpress/nav.html?wp-cron.php.html

While their implementation is overcomplicated, we can do the same thing in a few lines of code :)

@julienmoumne
Member

I'm ok with the proposal except for one bit.

I would like the implementation to be more object oriented.

There would be a Piwik_ScheduledTask, a Piwik_ScheduledTime.

Instead of having :

function getListHooksRegistered()
{
    return array(
        'TaskScheduler.getScheduledTasks' => 'runOptimizeTables',
    );
}

function runOptimizeTables($notification)
{
   // run every Monday at 2AM
   if( TaskScheduler.shouldRunTask( 'my task ID name', 'weekly' ))
   { 
        // execute task
   }
}

it would be

function getListHooksRegistered()
{
    return array(
        'TaskScheduler.getScheduledTasks' => 'getScheduledTasks',
    );
}

function getScheduledTasks($notification)
{
    $scheduledTasks = &$notification->getNotificationObject();

    $tableOptimisationScheduledTime = Piwik_ScheduledTime::factory('weekly');
    $tableOptimisationScheduledTime->setDay('monday');
    $tableOptimisationScheduledTime->setHour(13);
    $tableOptimisationScheduledTime->setMinute(20);

    $scheduledTasks[] = new Piwik_ScheduledTask('runOptimizeTables', $tableOptimisationScheduledTime);

}

function runOptimizeTables()
{
    // execute task
}
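
To make the object-oriented shape more concrete, here is a minimal self-contained sketch. The classes `SketchWeeklyTime` and `SketchTask` are illustrative stand-ins invented for this example; they are not the Piwik_ScheduledTime / Piwik_ScheduledTask classes from the patch, and the "next occurrence" rule (UTC, hour granularity) is an assumption.

```php
<?php
// Illustrative stand-ins only, not the classes from the actual patch.
class SketchWeeklyTime
{
    private $day = 'monday';
    private $hour = 0;

    public function setDay($day)   { $this->day = $day; }
    public function setHour($hour) { $this->hour = (int) $hour; }

    // Next occurrence of the configured weekday and hour after $now (UTC).
    public function getRescheduledTime($now)
    {
        $date = new DateTimeImmutable('@' . $now);
        return $date->modify('next ' . $this->day)
                    ->setTime($this->hour, 0)
                    ->getTimestamp();
    }
}

class SketchTask
{
    public $callback;
    public $time;

    public function __construct($callback, SketchWeeklyTime $time)
    {
        $this->callback = $callback;
        $this->time = $time;
    }
}

$time = new SketchWeeklyTime();
$time->setDay('monday');
$time->setHour(13);
$task = new SketchTask('runOptimizeTables', $time);

$wednesday = gmmktime(10, 0, 0, 7, 14, 2010);
echo gmdate('D Y-m-d H:i', $task->time->getRescheduledTime($wednesday)), "\n"; // prints Mon 2010-07-19 13:00
```

The point of the design is that the plugin only declares what to run and when; deciding whether it is time to run stays inside the scheduler.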

@mattab
Member

mattab commented Jul 14, 2010

  • setMinute() will not exist, as the smallest granularity is the hour (when using cron, so we should limit it globally to the hour).
  • To call the actual method 'runOptimizeTables' from the TaskScheduler plugin, you would need to know the plugin on which to call runOptimizeTables. You could pass the callback as array($this, 'runOptimizeTables') in the first parameter and use call_user_func.
  • The internal task ID could then be get_class( $callback[0] ) . '.' . $callback[1] (i.e. Piwik_UserSettings.runSomeTasks).

proposal looks good to me!
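
A small sketch of the callback and task ID idea from the bullets above. `DemoPlugin` and `taskIdFromCallback` are hypothetical names used only for illustration.

```php
<?php
// Hypothetical plugin class, used only to demonstrate the callback shape.
class DemoPlugin
{
    public function runSomeTasks()
    {
        return 'done';
    }
}

// Derive a task ID such as "DemoPlugin.runSomeTasks"
// from an array($object, 'method') callback.
function taskIdFromCallback(array $callback)
{
    return get_class($callback[0]) . '.' . $callback[1];
}

$callback = array(new DemoPlugin(), 'runSomeTasks');
echo taskIdFromCallback($callback), "\n"; // prints DemoPlugin.runSomeTasks
echo call_user_func($callback), "\n";     // prints done
```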

@julienmoumne
Member

Attachment:
piwik-dev1 (#1184).patch

@julienmoumne
Member

I have submitted a patch in which I decided to remove all modulo calculus in favor of easier to read and easier to maintain computations.

@mattab
Member

mattab commented Jul 24, 2010

(In [2648]) Fixes #1184 Great patch by Julien Moumne to add Scheduled Task API in Piwik

  • possibility to schedule daily/weekly/monthly tasks
  • tasks are executed via the crontab script for now (refs Archiving script: Port to Powershell #1411, which should be updated to trigger the tasks as well)
  • features the first use case: a monthly OPTIMIZE TABLE statement run on all Piwik archive tables (to defragment the space after we run the DELETE statements)
  • Next candidates: PDF reports by email, custom Alerts
  • comes with very serious unit testing

@mattab
Member

mattab commented Jul 27, 2010

(In [2697]) Refs #5491

  • Adding PDF plugin, based on the submission from jeremy lavaux and Lyzun Oleksandr.
  • I rewrote nearly all code to comply with Piwik coding styles/guidelines/ etc. and also because it had to use the Metadata which wasn't yet created when the PDF code was initially written
  • Features customizable PDF reports (based on metadata reports), with description + scheduling (daily/weekly/monthly) and send to current user as well as additional emails listed
  • Added helper function Piwik::getCurrentUserEmail()
  • Refactored window modal popover into a helper method piwikHelper.windowModal (used to ask confirmation when deleting stuff) Refs New design, list of improvements and small issues #1490
  • Refactored the Goals CSS into generic CSS which can be reused and is reused for PDF UI
    Refs Plugin API for Scheduled Tasks #1184
  • The callback must pass $this instead of the class name as it
    TODO
  • test scheduled tasks send email properly (email looks good + attachment works)
  • Add comment header in PDFReports files
  • Can we remove some files in TCPDF which adds a lot of space in the release (eg. some fonts in libs/fonts ?)
  • Test PDF with UTF8 characters

@mattab
Member

mattab commented Jul 28, 2010

(In [2737]) Refs #5491

  • Scheduled PDF reports by email work as expected
    • fixed issue with current week used instead of last finished period,
    • fixed issue that all recipients were listed in the same TO: field, now sending one email per address.
    • Super user API methods will return all PDF reports by default, but UI now only displays PDF created by Super User.
  • Refs Plugin API for Scheduled Tasks #1184 Better logging of which task was run and how long it took
  • The API call to run scheduled tasks must also be ported to Powershell refs Archiving script: Port to Powershell #1411

@anonymous-matomo-user

Is there a possibility to schedule PDF reports without using the crontab?
For me it would be nice to run reports e.g. when somebody logs in, because I do not have the possibility to create crontabs.

@mattab
Member

mattab commented Jul 29, 2010

Beatgarantie, scheduled reports should work without crontab in 0.7. Requests to the Tracker will trigger scheduled tasks hourly. See #587 - let me know if it works for you

@anonymous-matomo-user

@matt: OK, I will test.

It would be nice to see the PDF template after switching to another tracked page via the website dropdown.

@robocoder robocoder added this to the Piwik 0.8 - A Web Analytics platform milestone Jul 8, 2014
This issue was closed.

4 participants