How do you deal efficiently with Maven 3 timestamped snapshots?

86

Now that maven-3 has dropped support for <uniqueVersion>false</uniqueVersion> for snapshot artifacts, it seems that you really need to use timestamped SNAPSHOTS. m2eclipse in particular, which uses Maven 3 internally, seems to be affected by this: updating snapshots does not work when the SNAPSHOTS are not unique.

Previously, the best practice seemed to be to set all snapshots to uniqueVersion=false.

Now, switching to the timestamped version does not seem to be a big problem; after all, the snapshots are managed by a central Nexus repository, which is able to delete old snapshots at regular intervals.

The problem is the local developer workstations. Their local repositories quickly grow very large with unique snapshots.

How should I deal with this?

Right now I see the following possible solutions:

  • Ask the developers to purge their repository at regular intervals (which causes a lot of frustration, since deleting everything takes a long time and re-downloading everything that is needed takes even longer)
  • Set up a script that deletes all SNAPSHOT directories from the local repository, and ask developers to run it from time to time (better than the first option, but it still takes a while to run and to download the current snapshots)
  • Use the dependency:purge-local-repository plugin goal (it has problems when run from Eclipse because of open files, and it must be run from each project)
  • Set up Nexus on every workstation and configure a job to clean up old snapshots (best result, but I don't want to maintain 50+ Nexus servers, and memory is always tight on developer workstations)
  • Stop using SNAPSHOTS altogether
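
The second option (a script that removes every SNAPSHOT directory) could be sketched roughly like this in Python. The function names and the dry-run default are assumptions made for illustration; the only thing taken from Maven is that snapshot versions live in directories whose names end in -SNAPSHOT, such as de/glauche/a/0.0.1-SNAPSHOT.

```python
import os
import shutil


def find_snapshot_dirs(repo_root):
    """Return every directory under repo_root whose name ends with -SNAPSHOT."""
    found = []
    for current, dirs, _files in os.walk(repo_root):
        for name in list(dirs):
            if name.endswith("-SNAPSHOT"):
                found.append(os.path.join(current, name))
                # Do not descend into a directory that is going to be deleted.
                dirs.remove(name)
    return sorted(found)


def purge_snapshot_dirs(repo_root, dry_run=True):
    """Delete the SNAPSHOT directories; with dry_run=True only report them."""
    targets = find_snapshot_dirs(repo_root)
    for path in targets:
        if not dry_run:
            shutil.rmtree(path)
    return targets
```

Calling purge_snapshot_dirs(os.path.expanduser("~/.m2/repository")) first lists what would go; passing dry_run=False actually deletes it. As noted in the list, everything removed this way has to be downloaded again on the next build.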

What is the best way to keep the local repository from filling up the hard drive?

Update:

To verify the behaviour and give more information, I set up a small Nexus server, built two projects (a and b) and tried the following:

a:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>de.glauche</groupId>
  <artifactId>a</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <distributionManagement>
    <snapshotRepository>
        <id>nexus</id>
        <name>nexus</name>
        <url>http://server:8081/nexus/content/repositories/snapshots</url>
    </snapshotRepository>
  </distributionManagement>

</project>

b:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>de.glauche</groupId>
  <artifactId>b</artifactId>
  <version>0.0.1-SNAPSHOT</version>
    <distributionManagement>
    <snapshotRepository>
        <id>nexus</id>
        <name>nexus</name>
        <url>http://server:8081/nexus/content/repositories/snapshots/</url>
    </snapshotRepository>
  </distributionManagement>
 <repositories>
    <repository>
        <id>nexus</id>
        <name>nexus</name>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
        <url>http://server:8081/nexus/content/repositories/snapshots/</url>
    </repository>
 </repositories>
  <dependencies>
    <dependency>
        <groupId>de.glauche</groupId>
        <artifactId>a</artifactId>
        <version>0.0.1-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>

Now, when I use Maven and run "deploy" on "a", I get

a-0.0.1-SNAPSHOT.jar
a-0.0.1-20101204.150527-6.jar
a-0.0.1-SNAPSHOT.pom
a-0.0.1-20101204.150527-6.pom

in the local repository, with a new timestamped version each time I run the deploy goal. The same happens when I try to update snapshots from the Nexus server (close the "a" project, delete it from the local repository, build "b").

In an environment where lots of snapshots get built (think Hudson server ...), the local repository fills up with old versions fast.

Update 2:

To find out how and why this is failing, I did some more tests. Each test is run against a completely clean setup (de/glauche is deleted from both machines and from Nexus).

  • mvn deploy with maven 2.2.1 :

local repository on machine A does contain snapshot.jar + snapshot-timestamp.jar

BUT: only one timestamped jar in nexus, metadata reads:

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>de.glauche</groupId>
  <artifactId>a</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20101206.200039</timestamp>

      <buildNumber>1</buildNumber>
    </snapshot>
    <lastUpdated>20101206200039</lastUpdated>
  </versioning>
</metadata>
  • run update dependencies (on machine B) in m2eclipse (embedded m3 final) -> local repository has snapshot.jar + snapshot-timestamp.jar :(
  • run package goal with external maven 2.2.1 -> local repository has snapshot.jar + snapshot-timestamp.jar :(
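
To make the link between that metadata and the timestamped file name explicit, here is a small Python sketch. It assumes only the naming pattern visible in the listings above (artifactId, base version, timestamp, build number, extension); the function name and the embedded sample are illustrative, and this is not how Maven itself is implemented.

```python
import xml.etree.ElementTree as ET

# Sample copied from the metadata Nexus held after the Maven 2.2.1 deploy.
METADATA = """<metadata>
  <groupId>de.glauche</groupId>
  <artifactId>a</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <timestamp>20101206.200039</timestamp>
      <buildNumber>1</buildNumber>
    </snapshot>
    <lastUpdated>20101206200039</lastUpdated>
  </versioning>
</metadata>
"""


def timestamped_file_name(metadata_xml, extension):
    """Build the timestamped file name that the metadata points at."""
    root = ET.fromstring(metadata_xml)
    artifact_id = root.findtext("artifactId")
    version = root.findtext("version")
    timestamp = root.findtext("versioning/snapshot/timestamp")
    build_number = root.findtext("versioning/snapshot/buildNumber")
    if None in (artifact_id, version, timestamp, build_number):
        raise ValueError("metadata does not describe a timestamped snapshot")
    if not version.endswith("-SNAPSHOT"):
        raise ValueError("version is not a SNAPSHOT version")
    # "0.0.1-SNAPSHOT" becomes "0.0.1-" followed by timestamp and build number.
    base = version[: -len("SNAPSHOT")]
    return f"{artifact_id}-{base}{timestamp}-{build_number}.{extension}"
```

For the sample above, timestamped_file_name(METADATA, "jar") yields a-0.0.1-20101206.200039-1.jar, which is the kind of file that shows up next to a-0.0.1-SNAPSHOT.jar in the local repository.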

Ok, next try with maven 3.0.1 (after removing all traces of project a)

  • local repository on machine A looks better, only one non-timestamped jar

  • only one timestamped jar in nexus, metadata reads:

    <metadata>
      <groupId>de.glauche</groupId>
      <artifactId>a</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <versioning>
        <snapshot>
          <timestamp>20101206.201808</timestamp>
          <buildNumber>3</buildNumber>
        </snapshot>
        <lastUpdated>20101206201808</lastUpdated>
        <snapshotVersions>
          <snapshotVersion>
            <extension>jar</extension>
            <value>0.0.1-20101206.201808-3</value>
            <updated>20101206201808</updated>
          </snapshotVersion>
          <snapshotVersion>
            <extension>pom</extension>
            <value>0.0.1-20101206.201808-3</value>
            <updated>20101206201808</updated>
          </snapshotVersion>
        </snapshotVersions>
      </versioning>
    </metadata>

  • run update dependencies (on machine B) in m2eclipse (embedded m3 final) -> local repository has snapshot.jar + snapshot-timestamp.jar :(

  • run package goal with external maven 2.2.1 -> local repository has snapshot.jar + snapshot-timestamp.jar :(

So, to recap: the "deploy" goal in Maven 3 works better than in 2.2.1, and the local repository on the creating machine looks fine. But the receiver always ends up with lots of timestamped versions ...

What am I doing wrong?

Update 3

I also tested various other configurations. First I replaced Nexus with Artifactory: same behaviour. Then I used Linux Maven 3 clients to download the snapshots from the repository manager: the local repository still has timestamped snapshots :(

mglauche
Related question, about only the local .m2\repository part, focused on the local repository on a (Jenkins) build server: stackoverflow.com/q/9729076/223837.
MarnixKlooster ReinstateMonica
Here is a working link to the Apache Maven Compatibility Notes - cwiki.apache.org/confluence/display/MAVEN/…
aka_sh

Odpowiedzi:

36

The <uniqueVersion> configuration applied to artifacts that were deployed (via mvn deploy) to a Maven repository such as Nexus.

To remove these from Nexus, you can easily create an automated job to purge the SNAPSHOT repository every day. It can be configured to retain a certain number of snapshots or to keep them for a certain period of time. It's super easy and works great.

Artifacts in the local repository on a developer machine get there from the "install" goal and do not use these timestamps...they just keep replacing the one and only SNAPSHOT version unless you are also incrementing the revision number (e.g. 1.0.0-SNAPSHOT to 1.0.1-SNAPSHOT).

HDave
1
The problem is that the "install" goal is not much use in a distributed environment with many developers. We also use a Hudson server that builds (and deploys) new snapshots on every CVS commit, which happens quite often each day. I knew about the Nexus snapshot delete mechanism; see the list of possible workarounds.
mglauche
Each development machine should have a "local" repository under ~/.m2/repository and each pom.xml should have a repository definition that points to a single instance of Nexus on your LAN. (just like you show). We have this set up, along with Hudson that builds on every Subversion commit and it works great. SNAPSHOT builds are "deploy"ed to Nexus where they collect and are purged weekly. Developer machines automatically download the latest SNAPSHOT from Nexus to ~/.m2/repository and it replaces the previously downloaded one. Developers should never have their own Nexus instance.
HDave
2
I just read your update and have one more thing to add: the timestamped artifacts should never be seen inside your local (~/.m2/repository) repository. If they are, something is wrong. They should only be seen inside Nexus. Inside Nexus, yes, they collect quickly, potentially hundreds of MB a day. A Nexus job can easily purge these daily to keep the quantity small.
HDave
6
They definitely do end up in the local repository (the ~/.m2/repository one), they end up there after running the "deploy" target and on mvn -U install on the depending project (i.e. the B project). I even did test it with maven 2.2.1 and maven 3, both have the same behaviour.
mglauche
2
I think I get it now...they do NOT appear there when the developer does a "deploy", but rather when the developer builds a dependent project. At that time, the upstream project's latest SNAPSHOT is downloaded from Nexus to ~/.m2/repository with the timestamp left intact as part of the file name. Is this right?
HDave
13

This plugin removes the project's artifacts from the local repository. It is useful for keeping only one copy of a large local snapshot.

<plugin>         
    <groupId>org.codehaus.mojo</groupId>         
    <artifactId>build-helper-maven-plugin</artifactId>         
    <version>1.7</version>         
    <executions>           
        <execution>             
            <id>remove-old-artifacts</id>             
            <phase>package</phase>             
            <goals>               
                <goal>remove-project-artifact</goal>             
            </goals>            
            <configuration>  
                <removeAll>true</removeAll><!-- When true, remove all built artifacts including all versions. When false, remove all built artifacts of this project version -->             
            </configuration>          
        </execution>         
    </executions>       
</plugin>
Cathy
7

Well, I didn't like any of the proposed solutions. Deleting the Maven cache often increases network traffic significantly and slows down the build process. build-helper-maven-plugin helps with only one artifact; I wanted a solution that can purge all outdated timestamped snapshot artifacts from the local cache in one simple command. After a few days of searching, I gave up and decided to write a small program. The final program seems to work quite well in our environment, so I decided to share it with others who may need such a tool. Sources can be pulled from GitHub: https://github.com/nadestin/tools/tree/master/MavenCacheCleanup
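
For readers who only want the idea, here is a minimal Python sketch of "keep the newest timestamped build, drop the older ones". It is not the linked program and has not been checked against it. The regular expression, the function names and the dry-run default are assumptions, based only on file names of the form a-0.0.1-20101204.150527-6.jar.

```python
import os
import re

# prefix, yyyymmdd.hhmmss timestamp, build number, then classifier/extension
TIMESTAMPED = re.compile(
    r"^(?P<prefix>.+)-(?P<timestamp>\d{8}\.\d{6})-(?P<build>\d+)(?P<suffix>[.-].+)$"
)


def outdated_snapshots(file_names):
    """Return timestamped names that are not the newest build of their group."""
    newest = {}  # (prefix, suffix) -> (timestamp, build, name)
    outdated = []
    for name in file_names:
        match = TIMESTAMPED.match(name)
        if match is None:
            continue  # plain -SNAPSHOT files, releases and metadata are kept
        key = (match.group("prefix"), match.group("suffix"))
        rank = (match.group("timestamp"), int(match.group("build")), name)
        if key not in newest:
            newest[key] = rank
        elif rank > newest[key]:
            outdated.append(newest[key][2])
            newest[key] = rank
        else:
            outdated.append(name)
    return sorted(outdated)


def purge_outdated(repo_root, dry_run=True):
    """Walk the repository and remove outdated timestamped files per directory."""
    removed = []
    for current, _dirs, files in os.walk(repo_root):
        for name in outdated_snapshots(files):
            path = os.path.join(current, name)
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return sorted(removed)
```

Because files are grouped by prefix and suffix, the newest .jar and the newest .pom are each kept, so nothing has to be downloaded again after the cleanup.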

yurinadestin
@HDave I didn't manage to properly format pom fragment here, check it at https://github.com/nadestin/tools/wiki/m2cachecleanup-maven-plugin. On our Jenkins slaves this utility reclaims ~200Mb of disk space daily.
yurinadestin
2

As far as the remote repository piece of this, I think the previous answers that discuss a purging of SNAPSHOTs on a regular interval will work. But no one has addressed the local-developer workstation synchronization part of your question.

We have not started using Maven3 yet, so we've yet to see SNAPSHOTs starting to build up on local machines.

But we have had different problems with m2eclipse. When we have "Workspace Resolution" enabled and the project exists within our workspace, source updates usually keep us on the bleeding edge. But we've found it's very difficult to get m2eclipse to update itself with recently published artifacts in Nexus. We're experiencing similar problems within our team and it's particularly problematic because we have a very large project graph... there are a lot of dependencies that won't be in your workspace but will be getting SNAPSHOTs published frequently.

I'm pretty sure this boils down to an issue in m2eclipse where it doesn't handle SNAPSHOTs exactly as it should. You can see in the Maven console within Eclipse where m2eclipse tells you it's skipping the update of a recently published SNAPSHOT because it's got a cached version. If you do a -U from a run configuration or from the command line, Maven will pick up the metadata change. But an "Update Snapshots..." selection should tell m2eclipse to have Maven expire this cache. It doesn't appear to be getting passed along. There appears to be a bug filed for this, if you're interested in voting for it: https://issues.sonatype.org/browse/MNGECLIPSE-2608

You made mention of this in a comment somewhere.

The best workaround for this problem seems to be having developers purge their local workstations when things start to break down from within m2eclipse. Similar solution to a different problem... Others have reported problems with Maven 2.2.1 and 3 backing m2eclipse, and I've seen the same.

I would hope if you're using Maven3 you can configure it to only pull the latest SNAPSHOT, and cache that for the amount of time the repository says (or until you expire it by hand). Hopefully then you won't need to have a bunch of SNAPSHOTs sitting in your local repository.

That is unless you're talking about a build server that is manually doing a mvn install on them. As for how to prevent SNAPSHOTs from building up in an environment like a build server, we've kind of dodged that bullet by having each build use its own workspace and local repository (though, in Maven 2.2.1, certain things such as POMs seem to always come out of ~/.m2/repository). The extra SNAPSHOTs really only stick around for a single build and then they get dropped (and downloaded again from scratch). So this approach does end up eating more space to begin with, but it tends to remain more stable than having everything resolved out of a single repository. This option (on Hudson) is called "Use private Maven repository" and is under the Advanced button of the Build section on project configurations when you've selected to build with Maven. Here is the help description for that option:

Normally, Hudson uses the local Maven repository as determined by Maven; the exact process seems to be undocumented, but it's ~/.m2/repository and can be overridden by <localRepository> in ~/.m2/settings.xml (see the reference for more details.) This normally means that all the jobs that are executed on the same node share a single Maven repository. The upside of this is that you can save disk space, but the downside is that sometimes those builds could interfere with each other. For example, you might end up having builds incorrectly succeed, just because you have all the dependencies in your local repository, despite the fact that none of the repositories in the POM might have them.

There are also some reported problems regarding having concurrent Maven processes trying to use the same local repository.

When this option is checked, Hudson will tell Maven to use $WORKSPACE/.repository as the local Maven repository. This means each job will get its own isolated Maven repository just for itself. It fixes the above problems, at the expense of additional disk space consumption.

When using this option, consider setting up a Maven artifact manager so that you don't have to hit remote Maven repositories too often.

If you'd prefer to activate this mode in all the Maven jobs executed on Hudson, refer to the technique described here.
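
Outside Hudson, a similar per-job isolation can be approximated by passing Maven's standard maven.repo.local property on the command line. The Python helper below is only a sketch: its name and the .repository directory name (borrowed from the help text above) are assumptions, and it does not claim to be how Hudson does this internally.

```python
import os


def maven_command(workspace, goals):
    """Build an mvn command line that uses a local repository inside workspace."""
    local_repo = os.path.join(workspace, ".repository")
    # -Dmaven.repo.local overrides the default ~/.m2/repository for this run.
    return ["mvn", f"-Dmaven.repo.local={local_repo}", *goals]
```

The returned list can be handed to subprocess.run(..., check=True). Deleting the workspace afterwards then also deletes every SNAPSHOT that build downloaded.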

Hope this helps - if it doesn't address your problem please let me know where I missed.

cwash
The bug mentioned above has been fixed: bugs.eclipse.org/bugs/show_bug.cgi?id=339527
HDave
1

In groovy, deleting timestamped files like artifact-0.0.1-20101204.150527-6.jar can be very simple:

root = 'path to your repository'

new File(root).eachFileRecurse {
  if (it.name.matches(/.*\-\d{8}\.\d{6}\-\d+\.[\w\.]+$/)) {
    println 'Deleting ' + it.name
    it.delete()
  }
}

Install Groovy, save the script into a file, and schedule it to run weekly, at startup, at logon, or whenever suits you.

Or you can even wire the execution into the Maven build using gmavenplus-plugin. Notice how Maven exposes the repository location in the property settings.localRepository, which is then bound through the configuration to the variable repository:

  <plugin>
    <groupId>org.codehaus.gmavenplus</groupId>
    <artifactId>gmavenplus-plugin</artifactId>
    <version>1.3</version>
    <executions>
      <execution>
        <phase>install</phase>
        <goals>
          <goal>execute</goal>
        </goals>
      </execution>
    </executions>
    <configuration>
      <properties>
        <property>
          <name>repository</name>
          <value>${settings.localRepository}</value>
        </property>
      </properties>
      <scripts>
        <script><![CDATA[
          new File(repository).eachFileRecurse {
            if (it.name.matches(/.*\-\d{8}\.\d{6}\-\d+\.[\w\.]+$/)) {
              println 'Deleting snapshot ' + it.getAbsolutePath()
              it.delete()
            }
          }
        ]]></script>
      </scripts>
    </configuration>
    <dependencies>
      <dependency>
        <groupId>org.codehaus.groovy</groupId>
        <artifactId>groovy-all</artifactId>
        <version>2.3.7</version>
        <scope>runtime</scope>
      </dependency>
    </dependencies>
  </plugin>  
vnov
0

Add the following parameter to your POM file

POM

<configuration>
<outputAbsoluteArtifactFilename>true</outputAbsoluteArtifactFilename>
</configuration>

https://maven.apache.org/plugins/maven-dependency-plugin/copy-mojo.html

POM example

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>2.10</version>
        <executions>
          <execution>
            <id>copy</id>
            <phase>package</phase>
            <goals>
              <goal>copy</goal>
            </goals>
            <configuration>
              <artifactItems>
                <artifactItem>
                  <groupId>junit</groupId>
                  <artifactId>junit</artifactId>
                  <version>3.8.1</version>
                  <type>jar</type>
                  <overWrite>false</overWrite>
                  <outputDirectory>${project.build.directory}/alternateLocation</outputDirectory>
                  <destFileName>optional-new-name.jar</destFileName>
                </artifactItem>
              </artifactItems>
              <outputAbsoluteArtifactFilename>true</outputAbsoluteArtifactFilename>
              <outputDirectory>${project.build.directory}/wars</outputDirectory>
              <overWriteReleases>false</overWriteReleases>
              <overWriteSnapshots>true</overWriteSnapshots>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Configure in Jenkins:

// copy artifact 
copyMavenArtifact(artifact: "commons-collections:commons-collections:3.2.2:jar", outputAbsoluteArtifactFilename: "${pwd()}/target/my-folder/commons-collections.jar")
vaquar khan